


A Survey on Deep Learning based Brain-Computer Interface: Recent Advances and New Frontiers

XIANG ZHANG, LINA YAO, University of New South Wales; XIANZHI WANG, University of Technology Sydney; JESSICA MONAGHAN, DAVID MCALPINE, Macquarie University; YU ZHANG, Stanford University

Brain-Computer Interface (BCI) bridges the human's neural world and the outer physical world by decoding individuals' brain signals into commands recognizable by computer devices. Deep learning has lifted the performance of brain-computer interface systems significantly in recent years. In this article, we systematically investigate brain signal types for BCI and related deep learning concepts for brain signal analysis. We then present a comprehensive survey of deep learning techniques used for BCI, summarizing over 230 contributions, most of which were published in the past five years. Finally, we discuss the applied areas, open challenges, and future directions for deep learning-based BCI.

Additional Key Words and Phrases: Brain-Computer Interface, deep learning, survey

1 INTRODUCTION
Brain-Computer Interface (BCI)1 systems interpret human brain patterns into messages or commands to communicate with the outer world [106]. BCI underpins many novel applications that are important to people's daily life, especially to people with psychological/physical diseases or disabilities. For example, highly fake-resistant user identification based on brain waves allows ordinary people to enjoy enhanced entertainment and security [225]; BCI enables the disabled, the elderly, and other people with limited motion ability to control wheelchairs, home appliances, and robots. The key challenge for BCI is to recognize human intent accurately given brain signals with a meager Signal-to-Noise Ratio (SNR). Poor generalization ability and low classification performance limit the broad real-world application of BCI.

The past few years have seen deep learning, or deep neural networks, applied to brain information processing to overcome the above challenges. As an emerging sub-field of machine learning, deep learning has shown excellent representation learning ability [38] and has since impacted a range of domains such as natural language processing, activity recognition, computer vision, and logic reasoning [195]. Differing from traditional machine learning algorithms, deep learning can learn distinct high-level representations from raw brain signals without manual feature selection. It also achieves high accuracy and scales well with the size of the training set.

1.1 Why Deep Learning?
Although traditional BCI systems have made tremendous progress [2, 18], BCI research still faces significant challenges. First, brain signals are easily corrupted by various biological artifacts (e.g., eye blinks, muscle artifacts, fatigue, and the concentration level) and environmental artifacts (e.g., noise) [2]. Therefore, it is crucial to distill informative data from corrupted brain signals and build a robust system that works in different situations. Second, BCI faces the low SNR of non-stationary

1 Several terms describe similar ideas to BCI, such as Brain Machine Interface (BMI) and Brain Interface (BI).




Fig. 1. The number of surveyed papers by publication year and BCI signal type (panels a and b); panel (c) demonstrates the general workflow of BCI systems. Not all papers are counted for the years 2018 and 2019 due to limited availability of the related data.

electrophysiological brain signals [148]. The low SNR cannot be easily addressed by traditional preprocessing or feature engineering methods due to the time complexity of those methods and the risk of information loss [223]. Third, feature engineering highly depends on human expertise in the specific domain. For example, investigating sleeping state through Electroencephalogram (EEG) signals requires basic biological knowledge. Human experience may help on certain aspects but falls short in more general circumstances. An automatic feature extraction method is highly desirable. Moreover, most existing machine learning research focuses on static data and therefore cannot classify rapidly changing brain signals accurately. For instance, the state-of-the-art classification accuracy for motor imagery EEG is only 60% to 80% [105]. Novel learning methods are required to deal with dynamic data streams in BCI systems.

To date, deep learning has been applied extensively in BCI applications and has shown success in addressing the above challenges [26, 110]. Deep learning has two advantages. First, it works directly on raw brain signals, thus avoiding time-consuming preprocessing and feature engineering. Second, deep neural networks can capture both representative high-level features and latent dependencies through deep structures. Our investigation (Figure 1) shows a surge of publications in deep learning based BCI since 2014.

1.2 Why this Survey is Necessary?
We conduct this survey for three reasons. First, there is no systematic and comprehensive introduction of BCI signals. Table 1 summarizes the existing surveys on BCI. To the best of our knowledge, the limited existing surveys [2, 18, 42, 100, 105, 106, 118, 200] only cover a subset of EEG signals. For example, Lotte et al. [105] and Wang et al. [198] focus on EEG without analyzing EEG signal types; Cecotti et al. [27] focus on Event-Related Potentials (ERP); Naseer et al. [126] focus on functional near-infrared spectroscopy (fNIRS); Mason et al. [118] briefly describe neurological phenomena like event-related desynchronization (ERD), P300, SSVEP, Visual Evoked Potentials (VEP), and Auditory Evoked Potentials (AEP) but do not organize them systematically; Abdulkader et al. [2] present a topology of brain signals but do not mention spontaneous EEG or Rapid Serial Visual Presentation (RSVP); Lotte et al. [106] do not consider ERD or RSVP; Roy et al. [208] list some deep learning based EEG studies but provide little analysis.

Second, little research has investigated the association between deep learning ([37, 38, 153]) and BCI ([2, 18, 42, 105, 106, 118]). To the best of our knowledge, this paper is the first comprehensive survey of recent advances in deep learning-based BCI. We also point out frontiers and promising directions in this area.



Table 1. Existing surveys on BCI in the last decade. The column 'Comprehensive on Signals?' indicates whether the survey summarizes all BCI signals. MI EEG refers to Motor Imagery EEG signals.

No. | Reference | Comprehensive on Signals? | Signal | Deep Learning | Publication Time | Area
1 | [42] | No | EMG, EOG | No | 2007 | -
2 | [200] | No | fMRI | Yes | 2018 | Mental Disease Diagnosis
3 | [105] | Partial | EEG (mainly MI EEG and P300) | No | 2007 | Classification
4 | [106] | Partial | EEG (mainly MI EEG, P300) | Partial | 2018 | Classification
5 | [118] | Partial | EEG (ERD, P300, SSVEP, VEP, AEP) | No | 2007 | -
6 | [99] | No | MRI, CT | Partial | 2017 | Medical Image Analysis
7 | [208] | No | EEG | Yes | 2019 | -
8 | [18] | No | EEG | No | 2007 | Signal Processing
9 | [198] | Partial | EEG | No | 2016 | BCI Applications
10 | [2] | Yes | - | No | 2015 | -
11 | [124] | No | EEG | Partial | 2018 | -
12 | [164] | No | EEG, fMRI | No | 2015 | Neurorehabilitation of Stroke
13 | [5] | No | MI EEG | No | 2015 | -
14 | [146] | No | fMRI | No | 2014 | -
15 | [54] | No | ERP (P300) | No | 2017 | Applications of ERP
16 | [100] | No | fMRI | Yes | 2018 | Applications of fMRI
17 | [184] | No | ERP | No | 2017 | Classification
18 | [52] | Partial | EEG | No | 2019 | Brain Biometrics
19 | Ours | Yes | Invasive signals, EEG and its subcategories, fNIRS, fMRI, EOG, MEG | Yes | - | -

Lastly, previous BCI surveys focus on specific areas or applications and lack an overview of broad scenarios. For example, Litjens et al. [99] summarize several deep neural network concepts aimed at medical image analysis; Soekadar et al. [164] review BCI systems and machine learning methods for stroke-related motor paralysis based on Sensori-Motor Rhythms (SMR); Vieira et al. [191] investigate the application of BCI systems to neurological disorders and psychiatry.

1.3 Our Contributions
To the best of our knowledge, this survey is the first comprehensive survey of the recent advances and frontiers of deep learning-based BCI. To this end, we have summarized over 240 contributions, most of which were published in the last five years (since 2014). We make several key contributions in this survey:

• We systematically review brain signals and deep learning techniques for BCI to help readers gain a comprehensive understanding of this area of research.
• We discuss the popular deep learning techniques and state-of-the-art models for BCI signals, providing some guidelines for choosing a suitable deep learning model given a specific signal type or BCI system.
• We review the applications and remaining challenges of deep learning based BCI and highlight some promising topics for future research.

We only briefly introduce BCI signals and deep learning models in Section 3 and Section 4. More details about BCI signals and deep learning models are available in an extended version2.

2 https://arxiv.org/abs/1905.04149



Table 2. Summary of various brain signals' characteristics. Intracortical and ECoG signals are invasive; EEG, fNIRS, fMRI, EOG, and MEG are non-invasive.

Signal | Intracortical | ECoG | EEG | fNIRS | fMRI | EOG | MEG
Risk | High | High | Low | Low | Low | Low | Low
Temporal resolution | High | High | Medium | Low | Low | Medium | High
Spatial resolution | Very High | High | Low | Medium | High | Low | Medium
Characteristic | Electrical | Electrical | Electrical | Optical | Metabolic | Electrical | Magnetic
Portability | Medium | Medium | High | High | Low | High | Low
Signal-to-Noise Ratio | High | High | Low | Low | Medium | Medium | Low
Cost | High | High | Low | Low | High | Low | High

2 GENERAL BCI SYSTEM
Figure 1c shows the general paradigm of a BCI system, which receives brain signals and converts them into control commands for computers. The system includes several key components: brain signal collection, signal preprocessing, feature engineering, classification, and smart equipment.

First, the system collects and preprocesses brain signals from humans. The preprocessing is necessary for denoising and enhancing the low-SNR signals. For example, collecting EEG signals requires placing a series of electrodes on the scalp of the human head to record the brain's electrical activity. Since the ionic current originates within the brain but is measured at the scalp, the skull can heavily decrease the SNR. The preprocessing component contains multiple steps such as signal cleaning, signal normalization, signal enhancement, and signal reduction; a sketch of such a pipeline is given below.
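To make these steps concrete, here is a minimal, illustrative Python sketch of a typical pipeline: bandpass filtering for signal cleaning followed by per-channel z-scoring for normalization. The sampling rate and cut-off frequencies are assumptions for illustration, not values prescribed by any surveyed work.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess_eeg(raw: np.ndarray, fs: float = 250.0,
                   low: float = 0.5, high: float = 50.0) -> np.ndarray:
    """Bandpass-filter each channel of a (channels, samples) array, then z-score it."""
    b, a = butter(4, [low / (fs / 2), high / (fs / 2)], btype="band")
    cleaned = filtfilt(b, a, raw, axis=-1)          # signal cleaning (denoising)
    mean = cleaned.mean(axis=-1, keepdims=True)     # per-channel statistics
    std = cleaned.std(axis=-1, keepdims=True)
    return (cleaned - mean) / (std + 1e-8)          # signal normalization
```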

Then, the system conducts feature engineering by extracting discriminative features from the processed signals. Traditional features are extracted from the time domain (e.g., variance, mean value, kurtosis), frequency domain (e.g., fast Fourier transform), or time-frequency domain (e.g., discrete wavelet transform). The features enrich the distinguishable information regarding user intention. Feature engineering is highly dependent on domain knowledge. For example, learning informative features from brain signals for the diagnosis of epileptic seizure requires biomedical knowledge. Manual feature extraction is also time-consuming and challenging. Recently, deep learning has provided a better option for extracting distinguishable features automatically.
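As an illustration of the three feature families above, the following hedged sketch computes time-domain statistics, an FFT-based band power, and discrete-wavelet energies for one single-channel epoch. The chosen band (alpha, 8-13 Hz), wavelet ('db4'), and decomposition level are illustrative assumptions.

```python
import numpy as np
from scipy.stats import kurtosis
import pywt  # PyWavelets, for the discrete wavelet transform

def handcrafted_features(epoch: np.ndarray, fs: float = 250.0) -> np.ndarray:
    """Concatenate time-, frequency-, and time-frequency-domain features."""
    # Time domain: mean, variance, kurtosis.
    time_feats = [epoch.mean(), epoch.var(), kurtosis(epoch)]
    # Frequency domain: log band power from the FFT (alpha band as an example).
    spectrum = np.abs(np.fft.rfft(epoch)) ** 2
    freqs = np.fft.rfftfreq(len(epoch), d=1.0 / fs)
    alpha = spectrum[(freqs >= 8) & (freqs <= 13)].sum()
    # Time-frequency domain: energy per discrete-wavelet-transform level.
    coeffs = pywt.wavedec(epoch, "db4", level=4)
    dwt_feats = [float((c ** 2).sum()) for c in coeffs]
    return np.array(time_feats + [np.log(alpha + 1e-12)] + dwt_feats)
```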

Finally, a classifier recognizes signals based on the extracted features and converts them into commands for external devices. Deep learning algorithms have proven more powerful in many cases than traditional machine learning methods.

BCI has vast existing and potential applications for both able-bodied people and the disabled. For example, patients with motion difficulties can control household appliances using their brain signals via a BCI system. Such a system can serve various purposes such as entertainment and security. We introduce deep learning based BCI applications in Section 7.

3 BCI SIGNALS
In this section, we present a comprehensive introduction of brain signals used in BCI systems. Figure 2 shows a taxonomy of brain signals based on the signal collection method. In particular, invasive signals are collected from places over or below the cortex surface; non-invasive signals are collected by external sensors. Table 2 summarizes the characteristics of various brain signals.

Invasive signals include intracortical signals and Electrocorticography (ECoG): the former requires inserting electrodes into the cortex of the brain [134], while the latter involves attaching electrodes under the skull, outside the brain parenchyma [88].

Non-invasive signals divide into Electroencephalogram (EEG), Functional near-infrared spectroscopy (fNIRS), Functional magnetic resonance imaging (fMRI), Electrooculography (EOG), and Magnetoencephalography (MEG) [12].



[Figure 2 presents a taxonomy tree: BCI signals split into invasive signals (intracortical, ECoG) and non-invasive signals (EEG, fNIRS, fMRI, EOG, MEG). EEG further divides into spontaneous EEG (sleeping, motor imagery, emotional EEG, mental disease, others), evoked potentials (ERP: VEP, AEP, SEP, RSVP, RAVP; SSEP: SSVEP, SSAEP, SSSEP), and ERD/ERS.]

Fig. 2. The biometric signals generally used in BCI systems. The dashed boxes (Intracortical, RAVP, SEP, SSAEP, and SSSEP) are not covered in this survey because no existing work addresses them with deep learning algorithms. P300, a prominent potential appearing around 300 ms after the presented stimulus, is not listed in this signal tree because it is subsumed by ERP (which covers all the potentials following the presented stimuli).

We focus on EEG signals and related subcategories as they dominate among non-invasive signals. EEG monitors the voltage fluctuations generated by electrical currents within human neurons. The electrodes attached to the scalp can measure various types of EEG signals, including spontaneous EEG [137], evoked potentials (EP), and event-related desynchronization/synchronization (ERD/ERS) [69]. Depending on the scenario, spontaneous EEG further diverges into sleeping EEG, motor imagery EEG, emotional EEG, mental disease EEG, etc. Similarly, EP divides into event-related potentials (ERP) [27] and steady-state evoked potentials (SSEP) [143] according to the frequency of the external stimuli. Each potential contains visual, auditory, or somatosensory potentials based on the type of external stimulus.

Regarding the other non-invasive signals, fNIRS produces functional neuroimages by employing near-infrared (NIR) light to measure the aggregation degree of oxygenated hemoglobin (Hb) and deoxygenated hemoglobin (deoxy-Hb), both of which absorb light more strongly than other head components such as the skull and scalp [127]; fMRI monitors brain activities by detecting blood-flow changes in brain areas [200]; EOG is promising for controlling physical devices with the eyes [206]; MEG reflects brain activities via magnetic changes [163].



[Figure 3 presents a taxonomy of deep learning models: discriminative models (MLP, CNN, RNN with LSTM and GRU), representative models (AE, RBM, D-AE, D-RBM, DBN as DBN-AE and DBN-RBM), generative models (VAE, GAN), and hybrid models.]

Fig. 3. Deep learning models, divided into discriminative, representative, generative, and hybrid models based on their function. D-AE denotes a stacked autoencoder, i.e., an autoencoder with multiple hidden layers. A Deep Belief Network (DBN) can be composed of AEs or RBMs; therefore, we divide DBN into DBN-AE (stacked AE) and DBN-RBM (stacked RBM).


4 DEEP LEARNING MODELS
In this section, we briefly present commonly used deep learning algorithms, including concepts, architectures, and techniques used in BCI. Deep learning is the branch of machine learning that learns representations, and thereby accomplishes advanced tasks, through multiple neural layers. While the standard architecture of a neural network contains an input layer, a hidden layer, and an output layer, a "deep" neural network generally contains more than one hidden layer.

Deep learning falls into several categories based on its aim (Figure 3):
• Discriminative deep learning models, which classify the input data into known categories by learning discriminative features adaptively. Discriminative algorithms learn distinctive features through non-linear transformation and perform classification based on probabilistic prediction. They can be used for both feature engineering and classification (see Figure 1c). They mainly include Multi-Layer Perceptron (MLP), Recurrent Neural Networks (RNN) (including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU)), Convolutional Neural Networks (CNN), and their variations.
• Representative deep learning models. Such models learn pure representative features from the input data. These algorithms serve only for feature engineering rather than classification. Commonly used algorithms in this category include the Autoencoder (AE), Restricted Boltzmann Machine (RBM), Deep Belief Networks (DBN), and their variations.
• Generative deep learning models. Such models learn the joint probability distribution of the input data along with the target label. Generative algorithms are mostly used for reconstruction or to generate batches of brain-signal samples to enhance the training set in BCI. Commonly used models in this category include the Variational Autoencoder (VAE)3 and Generative Adversarial Networks (GAN).
• Hybrid deep learning models. Such models combine two or more deep learning models. There are two typical hybrid patterns: 1) the combination of LSTM and CNN, which is popular for spatial-temporal feature extraction; and 2) the combination of a representative algorithm (for feature extraction) with a discriminative algorithm (for classification).

3 VAE is a variation of AE; however, they serve different purposes.



Table 3. Summary of deep learning model types.

Deep Learning | Input | Output | Function | Training method
Discriminative | Input data | Label | Feature extraction, classification | Supervised
Representative | Input data | Representation | Feature extraction | Unsupervised
Generative | Input data | New sample | Generation, reconstruction | Unsupervised
Hybrid | Input data | - | - | -


Table 3 summarizes the characteristics of each deep learning category. Note that we do not regard the softmax layer as an algorithmic component because almost all classification functions in neural networks are implemented using a softmax layer. For instance, a model that combines a DBN and a softmax layer is considered a representative model rather than a hybrid model in this survey.
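To make this convention concrete, here is a minimal PyTorch sketch of a representative model: an autoencoder trained unsupervised on a reconstruction loss, with a separate softmax head on the learned features. By the convention above, this still counts as a representative model, not a hybrid. All layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_inputs: int = 310, n_hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_inputs, n_hidden), nn.ReLU())
        self.decoder = nn.Linear(n_hidden, n_inputs)

    def forward(self, x):
        z = self.encoder(x)          # learned representation
        return self.decoder(z), z    # reconstruction and features

ae = Autoencoder()
recon_loss = nn.MSELoss()                                          # unsupervised signal
classifier = nn.Sequential(nn.Linear(64, 4), nn.Softmax(dim=-1))   # softmax head only

x = torch.randn(8, 310)              # a batch of feature vectors (e.g., PSD features)
recon, feats = ae(x)
loss = recon_loss(recon, x)          # train the AE on reconstruction alone
probs = classifier(feats.detach())   # classification is delegated to the softmax head
```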

5 STATE-OF-THE-ART DL TECHNIQUES FOR BCI
In this section, we systematically summarize existing deep learning based studies for BCI.

5.1 Intracortical and ECoG
Little research on intracortical signals and ECoG is based on deep learning algorithms. Currently, intracortical brain signals are mainly investigated by medical and biological researchers, who pay little attention to deep learning. Antoniades et al. [10] employed a CNN with two convolutional layers to extract features from epileptic intracortical data for detecting interictal epileptic discharges (IED). They sliced the input data into many 80 ms segments with 40 ms overlap and achieved an accuracy of about 87.51%. Their later work [11] used a deep neural architecture to map scalp signals to pseudo-intracranial brain signals.

Most ECoG-related studies have focused on medical healthcare, especially epileptic seizure diagnosis. For example, Hosseini et al. [67] worked on seizure diagnosis and localization based on bandpass-filtered (0.5-150 Hz) EEG and ECoG signals. The authors manually extracted features through Principal Component Analysis (PCA), Independent Component Analysis (ICA), and the Differential Search Algorithm (DSA), fed the features to two different deep learning structures (CNN and DBN-AE), and reported that CNN achieved the higher accuracy [67].

Kiral-Kornek et al. [80] proposed an MLP algorithm for an epileptic seizure prediction system that can operate on a wearable device under ultra-low power requirements and achieved a sensitivity of 69%. Apart from seizure diagnosis, Xie et al. [203] focused on decoding finger trajectories from ECoG signals and developed a hybrid deep neural network based on convolutional layers and LSTM cells. Notably, they employed the CNN for not only spatial but also temporal convolution. The convolution operations produced fixed-length vector representations that were sent to the LSTM cells for trajectory tracking. Each ECoG segment lasts 1 second with 40 ms overlap; thus, the model can receive a stream of ECoG signals and form a complete finger trajectory, as sketched below.
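The following hedged PyTorch sketch illustrates this CNN-to-LSTM pattern: a convolution summarizes each one-second segment into a fixed-length vector, and an LSTM consumes the stream of vectors, emitting one prediction per segment. Channel counts, kernel sizes, and the 2-D output head are hypothetical choices, not the architecture of Xie et al. [203].

```python
import torch
import torch.nn as nn

class ConvLSTMDecoder(nn.Module):
    def __init__(self, n_channels: int = 32, n_outputs: int = 2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=7, padding=3),  # spatio-temporal conv
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),          # fixed-length vector per segment
        )
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, n_outputs)  # e.g., a 2-D finger position

    def forward(self, segments):
        # segments: (batch, n_segments, n_channels, samples_per_segment)
        b, s, c, t = segments.shape
        feats = self.conv(segments.reshape(b * s, c, t)).squeeze(-1)  # (b*s, 16)
        out, _ = self.lstm(feats.reshape(b, s, 16))                   # stream of vectors
        return self.head(out)                 # one prediction per 1 s segment

# Ten hypothetical 1 s segments at 1 kHz from 32 electrodes:
pred = ConvLSTMDecoder()(torch.randn(4, 10, 32, 1000))  # -> (4, 10, 2)
```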

5.2 EEG
More than half of the recent publications relate to EEG signals, owing to their non-invasiveness, high portability, and low price. In this section, we summarize three aspects of EEG signals: EEG oscillations, evoked potentials, and ERD/ERS.

5.2.1 EEG Oscillatory. We present the deep learning models for spontaneous EEG according tothe application scenarios as follows.



(1) Sleeping EEG. Sleep EEG is mainly used for recognizing sleep stages, diagnosing sleep disorders, and cultivating healthy habits [31, 216].

(i) Discriminative models. CNNs are frequently used for sleep stage classification on single-channel EEG [166, 184]. For example, Vilamala et al. [192] manually extracted time-frequency features and achieved a classification accuracy of 86%. Others used RNN [23] and LSTM [185] based on various features from the frequency domain, correlation, and graph-theoretical features.

(ii) Representative models. Tan et al. [180] adopted a DBN-RBM algorithm to detect sleep spindles based on PSD features extracted from sleeping EEG signals and achieved an F1 score of 92.78% on a local dataset. Zhang et al. [216] further employed a DBN-RBM with three RBMs for sleep feature extraction.

(iii) Hybrid models. Manzano et al. [113] presented a multi-view algorithm to predict sleep stages by combining CNN and MLP: the CNN received the raw time-domain EEG oscillations while the MLP received the spectral signals processed by the Short-Time Fourier Transform (STFT) over 0.5-32 Hz (a sketch of this multi-view pattern follows this paragraph). Supratak et al. [176] proposed a model combining a multi-view CNN and LSTM for automatic sleep stage scoring, in which the former was adopted to discover time-invariant dependencies while the latter (a bidirectional LSTM) captured temporal features during sleep. Dong et al. [39] proposed a hybrid deep learning model for temporal sleep stage classification, taking advantage of MLP for detecting hierarchical features and LSTM for sequential information learning.
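Below is a minimal PyTorch sketch of that multi-view fusion idea: a CNN branch reads the raw time-domain epoch, an MLP branch reads its STFT spectrum, and a fusion layer predicts the sleep stage. Epoch length, bin count, and layer sizes are illustrative assumptions, not those of Manzano et al. [113].

```python
import torch
import torch.nn as nn

class MultiViewSleepNet(nn.Module):
    def __init__(self, n_bins: int = 64, n_stages: int = 5):
        super().__init__()
        self.cnn = nn.Sequential(                      # view 1: raw time-domain epoch
            nn.Conv1d(1, 8, kernel_size=50, stride=6), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8), nn.Flatten())     # -> 64 features
        self.mlp = nn.Sequential(                      # view 2: STFT spectrum (0.5-32 Hz)
            nn.Linear(n_bins, 64), nn.ReLU())
        self.fusion = nn.Linear(64 + 64, n_stages)     # fuse the two views

    def forward(self, raw, spectrum):
        h = torch.cat([self.cnn(raw), self.mlp(spectrum)], dim=-1)
        return self.fusion(h)

# A hypothetical 30 s epoch at 100 Hz plus a 64-bin spectrum:
logits = MultiViewSleepNet()(torch.randn(2, 1, 3000), torch.randn(2, 64))
```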

(2) MI EEG. Deep learning models have shown superiority in the classification of Motor Imagery (MI) EEG and real-motor EEG [59, 129].

(i) Discriminative models. Such models mostly use CNN to recognize MI EEG [219]. Some are based on manually extracted features [74, 207]. For instance, Lee et al. [87] and Zhang et al. [220] employed CNN and 2-D CNN, respectively, for classification; Zhang et al. [220] learned affective information from EEG signals based on a modified LSTM to control smart home appliances. Others used CNN for feature engineering [181]. For example, Wang et al. [197] first used CNN to capture latent connections from MI EEG signals and then applied weak classifiers to choose important features for the final classification; Hartmann et al. [59] investigated how CNN represents spectral features through the sequence of MI EEG samples. MLP has also been applied to MI EEG recognition [172], showing higher sensitivity to EEG phase features at earlier stages and higher sensitivity to EEG amplitude features at later stages.

(ii) Representative models. DBN is widely used as a basis for MI EEG classification for its high representative ability [84, 107]. For example, Ren et al. [144] applied a convolutional DBN based on RBM components, showing better feature representation than hand-crafted features; Li et al. [89] processed EEG signals with the discrete wavelet transform and then applied a DBN-AE based on denoising AEs. Other models include the combination of an AE (for feature extraction) and a KNN classifier [142], the combination of a Genetic Algorithm (for hyper-parameter tuning) and MLP (for classification) [128], the combination of AE and XGBoost for multi-person scenarios [221], and the combination of LSTM and reinforcement learning for multi-modality signal classification [224, 226].

(iii) Hybrid models. Several studies proposed hybrid models for the recognition of MI EEG [35]. For example, Fraiwan et al. [44] combined DBN with MLP for neonatal sleep state identification; Tabar et al. [177] extracted high-level representations from the time domain, frequency domain, and location information of EEG signals using CNN and then used a DBN-AE with seven AEs as the classifier; Tan et al. [179] used a denoising AE for dimensionality reduction and a multi-view CNN combined with RNN for discovering latent temporal and spatial information, finally achieving an average accuracy of 72.22% on a public dataset.



(3) Emotional EEG. The emotion of an individual can be evaluated along three aspects: valence, arousal, and dominance. The combination of the three aspects forms emotions such as fear, sadness, and anger, which can be revealed by EEG signals.

(i) Discriminative models. MLPs have traditionally been used [45, 209], while CNN and RNN are increasingly popular in EEG based emotion prediction [91, 104]. Typical CNN-based work in this category includes hierarchical CNN [91, 92] and augmenting the training set for CNN [194]. Li et al. [91] were the first to propose capturing the spatial dependencies among EEG channels by converting multi-channel EEG signals into a 2-D matrix. Besides, Talathi [178] used a discriminative deep learning model composed of GRU cells. Zhang et al. [218] proposed a spatial-temporal recurrent neural network, which employs a multi-directional RNN layer to discover long-range contextual cues and a bi-directional RNN layer to capture the temporal features of the sequences produced by the preceding spatial RNN.

(ii) Representative models. DBN, especially DBN-RBM, is widely used for its unsupervised representation ability in emotion recognition [47, 93, 96]. For instance, Xu et al. [204, 205] proposed a DBN-RBM algorithm with three RBMs and an RBM-AE to predict affective states; Zhao et al. [228] and Zheng et al. [229] combined DBN-RBM with SVM and a Hidden Markov Model (HMM), respectively, to address the same problem; Zheng et al. [230, 231] introduced a D-RBM with five hidden RBM layers to search for the important frequency patterns and informative channels in affect recognition; Jia et al. [73] eliminated channels with high errors and then used D-RBM for affective state recognition based on representative features of the remaining channels.

Emotion is affected by many subjective and environmental factors (e.g., gender and fatigue). Yan et al. [103] investigated the discrepancy of emotional patterns between men and women by proposing a novel model called the Bimodal Deep AutoEncoder (BDAE), which received both EEG and eye movement features and shared the information in a fusion layer connected to an SVM classifier. The results showed that females have higher EEG signal diversity for the fearful emotion while males do for sadness. Moreover, for women, the inter-subject differences in fear are more significant than for other emotions [103]. To overcome the mismatched distributions among samples collected from different subjects or different experimental sessions, Chai et al. [30] proposed an unsupervised domain adaptation technique called the subspace alignment autoencoder (SAAE) by combining an AE and a subspace alignment solution. The proposed approach obtained a mean accuracy of 77.88% in a person-independent scenario.

(iii) Hybrid models. One commonly used hybrid model is a combination of RNN and MLP. For example, Alhagry et al. [7] employed an LSTM architecture for feature extraction from emotional EEG signals, and the features were forwarded to an MLP for classification. Furthermore, Yin et al. [212] proposed a multi-view ensemble classifier to recognize individual emotions using multimodal physiological signals. The ensemble classifier contains several D-AEs with three hidden layers and a fusion structure. Each D-AE receives one physiological signal (e.g., EEG, EOG, EMG) and sends its outputs to a fusion structure composed of another D-AE. At last, an MLP classifier classifies the fused features. Kawde et al. [78] implemented an affect recognition system by combining a DBN-RBM for effective feature extraction and an MLP for classification.

(4) Mental Disease EEG. A large number of researchers have exploited EEG signals to diagnose neurological disorders, especially epileptic seizure [215].

(i) Discriminative models. CNN is widely used in the automatic detection of epileptic seizures [4, 152, 189, 196]. For example, Johansen et al. [75] adopted CNN to work on high-pass filtered (1 Hz) EEG signals of epileptic spikes and achieved an AUC of 94.7%. Acharya et al. [3] employed a CNN model with 13 layers for depression detection, which was evaluated on a local dataset with 30 subjects and achieved accuracies of 93.5% and 96.0% based on the left- and right-hemisphere



EEG signals, respectively. Morabito et al. [122] exploited a CNN structure to extract suitable features from multi-channel EEG signals to distinguish Alzheimer's Disease patients from patients with Mild Cognitive Impairment and a healthy control group. The EEG signals were bandpass filtered (0.1-30 Hz), and the model achieved an accuracy of around 82% for three-class classification. REM Behavior Disorder (RBD) may lead to mental disorders like Parkinson's disease (PD). Ruffini et al. [145] described an Echo State Network (ESN) model, a particular class of RNN, to distinguish RBD patients from healthy individuals. In some research, the discriminative model is only employed for feature extraction. For example, Ansari et al. [9] used CNN to extract latent features, which were fed into a Random Forest classifier for the final seizure detection in neonatal babies. Chu et al. [33] combined CNN and a traditional classifier for schizophrenia recognition.

(ii) Representative models. For disease detection, one commonly used method adopts a representative model (e.g., DBN) followed by a softmax layer for classification [187]. Page et al. [132] adopted a DBN-AE to extract informative features from seizure EEG signals; the extracted features were fed into a traditional logistic regression classifier for seizure detection. Al et al. [6] proposed a multi-view DBN-RBM structure to analyze EEG signals from depression patients. The proposed approach contains multiple input pathways, each composed of two RBMs and corresponding to one EEG channel; all input pathways merge into a shared structure composed of further RBMs. Some works preprocess the EEG signals through dimensionality reduction methods such as PCA and ICA [65], while others prefer to feed the raw signals directly to the representative model [97]. Lin et al. [97] proposed a sparse D-AE with three hidden layers to extract representative features from epileptic EEG signals, while Hosseini et al. [65] adopted a similar sparse D-AE with two hidden layers.

(iii) Hybrid models. A popular hybrid method is a combination of RNN and CNN. Shah et al. [155] investigated the performance of CNN-LSTM for seizure detection after channel selection; the sensitivities ranged from 33% to 37% while false alarms ranged from 38% to 50%. Golmohammadi et al. [49] proposed a hybrid architecture for automatic interpretation of EEG that integrates both temporal and spatial information: 2D and 1D CNNs capture the spatial features while LSTM networks capture the temporal features. The authors reported a sensitivity of 30.83% and a specificity of 96.86% on the well-known TUH EEG seizure corpus. For the detection of early-stage Creutzfeldt-Jakob Disease (CJD), Morabito et al. [123] combined D-AE and MLP. The EEG signals of CJD patients were first bandpass filtered (0.5-70 Hz) and then fed into a D-AE with two hidden layers for feature representation; at last, the MLP classifier obtained an accuracy of 81-83% on a local dataset. A convolutional autoencoder, which replaces the fully-connected layers in a standard AE with convolutional and de-convolutional layers, has been applied to extract seizure features in an unsupervised manner [201].

(5) Data augmentation. Generative models such as GAN can be used for data augmentation in BCI classification [1]. Palazzo et al. [133] first demonstrated that the information contained in brainwaves suffices to distinguish visual objects and then extracted a more robust and distinguishable representation of EEG data using RNN. At last, they employed the GAN paradigm to train an image generator conditioned on the learned EEG representations, which could convert EEG signals into images [133]. Kavasidis et al. [76] likewise aimed at converting EEG signals into images. The EEG signals were collected while the subjects were observing images on a screen; an LSTM layer was employed to extract latent features from the EEG signals, and the extracted features served as the input of a GAN structure. The generator and the discriminator of the GAN were both composed of convolutional layers, and after pre-training, the generator was supposed to generate an image based on the input EEG signals. Abdelfattach et al. [1] adopted a GAN for seizure data augmentation. The generator and discriminator are both composed of



fully-connected layers. The authors demonstrated that GAN outperforms AE and VAE. After the augmentation, the classification accuracy increased dramatically from 48% to 82%. A minimal sketch of such a fully-connected GAN follows.
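For concreteness, here is a hedged PyTorch sketch of a fully-connected GAN for augmenting one-dimensional EEG segments; the latent size, segment length, and optimizer settings are illustrative assumptions rather than the setup of Abdelfattach et al. [1].

```python
import torch
import torch.nn as nn

LATENT, SIGNAL = 100, 512  # noise dimension and EEG segment length (assumed)

G = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(),
                  nn.Linear(256, SIGNAL), nn.Tanh())   # generator: noise -> segment
D = nn.Sequential(nn.Linear(SIGNAL, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))                   # discriminator: segment -> logit

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real: torch.Tensor) -> None:
    """One adversarial update; `real` holds normalized seizure segments."""
    n = real.size(0)
    fake = G(torch.randn(n, LATENT))
    # Discriminator: push real segments toward 1, generated ones toward 0.
    loss_d = bce(D(real), torch.ones(n, 1)) + bce(D(fake.detach()), torch.zeros(n, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool the discriminator into labeling fakes as real.
    loss_g = bce(D(fake), torch.ones(n, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# Generated segments can then be mixed into the training set of the classifier.
```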

(6) Others. Some research has explored a wide range of exciting topics. The first is how EEG is affected by audio/visual stimuli. This differs from the potentials evoked by audio/visual stimulation because the stimuli in this setting are continuously present instead of flickering at a particular frequency. Stober et al. [169, 170] claimed that rhythm-evoked EEG signals are informative enough to distinguish the rhythm stimuli. The authors conducted an experiment where 13 participants were stimulated by 24 rhythmic stimuli, comprising 12 East African and 12 Western stimuli. For the 24-category classification, the proposed CNN achieved a mean accuracy of 24.4%. After that, the authors exploited a convolutional AE for representation learning and CNN for recognition and achieved an accuracy of 27% for 12-class classification [171]. Sternin et al. [168] adopted CNN to capture discriminative features from EEG oscillations to distinguish whether the subject was listening to or imagining music. Similarly, Sarkar et al. [150] designed two deep learning models to recognize EEG signals aroused by audio or visual stimuli; for this binary classification task, the proposed CNN and DBN-RBM with three RBMs achieved accuracies of 91.63% and 91.75%, respectively. Furthermore, spontaneous EEG can be used to distinguish the user's mental state (logical versus emotional) [21].

Moreover, some researchers focus on the impact of cognitive load [159] or physical workload [50] on EEG. Bashivan et al. [19] first extracted informative features through wavelet entropy and band-specific power, which were fed into a DBN-RBM for further refining; at last, an MLP was employed for cognitive load level recognition. In another work [20], the authors attempted to find general features that remain invariant in inter-/intra-subject scenarios under various mental loads. Yin et al. [211] collected EEG signals at different mental workload levels (e.g., high and low) for binary classification. The EEG signals were low-pass filtered, transformed to the frequency domain, and converted to power spectral density (PSD) features; the extracted PSD features were fed into a denoising D-AE structure for further refining. They finally obtained an accuracy of 95.48%. Li et al. [94] worked on the recognition of mental fatigue levels, including alert, slight fatigue, and severe fatigue.

In addition, EEG based driver fatigue detection is an attractive area [29, 40, 56]. Huang et al. [70] designed a 3D CNN to predict the reaction time in drowsy driving, which is meaningful for reducing traffic accidents. Hajinoroozi et al. [55] adopted a DBN-RBM to handle EEG signals that had been processed by ICA; they achieved an accuracy of around 85% in binary classification ('drowsy' or 'alert'). The strength of this paper is that they evaluated the DBN-RBM at three levels: time samples, channel epochs, and windowed samples. The experiments illustrated that the channel epoch level outperformed the other two. San et al. [149] combined deep learning models with a traditional classifier to detect driver fatigue; the model contains a DBN-RBM structure followed by an SVM classifier, which achieved a detection accuracy of 73.29%. Almogbel et al. [8] investigated drivers' mental states under different low workload levels; the proposed CNN is claimed to detect the driving workload directly from the raw EEG signals.

Research on the detection of eye state has shown outstanding accuracy. Narejo et al. [125] explored the detection of eye state (closed or open) based on EEG signals; they tried a DBN-RBM with three RBMs and a DBN-AE with three AEs and achieved a high accuracy of 98.9%. Reddy et al. [141] tried a simpler structure, MLP, and obtained a slightly lower accuracy of 97.5%.

There are several overlooked yet promising areas. Baltatzis et al. [16] adopted CNN to detect school bullying from EEG recorded while subjects watched specific videos; they achieved 93.7% and 88.58% for binary and four-class classification, respectively. Khurana et al. [79] proposed deep dictionary learning that outperformed several deep learning methods. Volker et al. [193] evaluated the use of deep CNN in the



Flanker task, achieving an average accuracy of 84.1% within subjects and 81.7% on unseen subjects. Zhang et al. [222] combined CNN and a graph network to discover latent information from EEG signals.

Miranda-Correa et al. [121] proposed a cascaded framework combining RNN and CNN to predict individuals' affective levels and personal factors (Big-Five personality traits, mood, and social context). An experiment conducted by Putten et al. [139] attempted to identify users' gender from their EEG signals; they employed a standard CNN algorithm and achieved a binary classification accuracy of 81% on a local dataset. The detection of emergency braking intention could help reduce response time. Hernandez et al. [63] demonstrated that a driver's EEG signals can distinguish braking intention from the normal driving state; they applied a CNN algorithm which achieved an accuracy of 71.8% in binary classification. Behncke et al. [22] applied deep learning, specifically a CNN model, in the context of robot assistive devices; they attempted to use CNN to improve the accuracy of decoding robot errors from EEG while the subject was watching the robot during both an object-grasping and a pouring task.

Teo et al. [182] tried to combine BCI and recommender systems, predicting the user's preference from EEG signals. Sixteen participants took part in the experiment, in which EEG signals were collected while each subject was presented with 60 bracelet-like objects as rotating visual stimuli (3D objects). An MLP algorithm was then adopted to classify whether the user liked or disliked the object; this exploration achieved a prediction accuracy of 63.99%. Some researchers have tried to explore a common framework that can be used for various BCI paradigms. Lawhern et al. [86] introduced EEGNet, based on a compact CNN, and evaluated its robustness in various BCI contexts.

5.2.2 Evoked Potential. (1) ERP. In most situations, ERP signals are analyzed through the P300 phenomenon; meanwhile, almost all studies on P300 are based on the ERP scenario. Therefore, in this section, the majority of P300-related publications are introduced in the VEP/AEP subsections according to the scenario.

(i) VEP. VEP is one of the most popular subcategories of ERP [54, 167, 210]. Ma et al. [120] worked on motion-onset VEP (mVEP) by extracting representative features through deep learning: a genetic algorithm combined with a multi-level sensing structure compressed the raw signals, and the compressed signals were sent to a DBN-RBM algorithm to capture more abstract high-level features. Maddula et al. [109] filtered P300 signals with visual stimuli by a bandpass filter (2-35 Hz) and then fed them into a proposed hybrid deep learning model for further analysis; the model includes a 2D CNN structure to capture spatial features, followed by an LSTM layer for temporal feature extraction. Liu et al. [102] combined a DBN-RBM representative model with an SVM classifier for a concealed information test and achieved a high accuracy of 97.3% on a local dataset. Gao et al. [46] employed an AE model for feature extraction followed by an SVM classifier; in the experiment, each segment contained 150 points, divided into five time steps of 30 points each, and the model achieved an accuracy of 88.1% on a local dataset. A wide range of P300-related studies is based on the P300 speller [158], which allows the user to write characters. Cecotti et al. [28] tried to increase the P300 detection accuracy for more precise word spelling; they presented a new CNN-based model comprising five low-level CNN classifiers with different feature sets, with the final high-level result voted on by the low-level classifiers. The highest accuracy reached 95.5% on dataset II of the third BCI competition. Liu et al. [101] proposed a Batch Normalized Neural Network (BN3), a variant of CNN, for the P300 speller; the proposed method consists of six layers, with batch normalization applied to each batch. Kawasaki et al. [77] employed an MLP model to separate P300 segments from non-P300 segments and achieved an accuracy of 90.8%.



(ii) AEP. A few works focused on the recognition of AEP. For example, Carabez et al. [24] proposed and tested 18 CNN structures to classify single-trial AEP signals. In the experiment, the volunteers were required to wear earphones that produced auditory stimuli designed based on the oddball paradigm. The experimental analysis demonstrated that the CNN frameworks, regardless of the number of convolutional layers, were effective in extracting the temporal and spatial features and provided competitive results. The AEP signals were filtered to 0.1-8 Hz and downsampled from 256 Hz to 25 Hz; the experimental results showed that the downsampled data worked better.

(iii) RSVP. Among the various VEP paradigms, RSVP has attracted much attention [51]. In the analysis of RSVP, a number of discriminative deep learning models (e.g., CNN [28, 98, 165] and MLP [114]) have achieved great success. A common preprocessing method for RSVP signals is frequency filtering, with pass bands generally ranging from 0.1 to 50 Hz [111, 157]. Cecotti et al. [26] worked on the classification of ERP signals in the RSVP scenario and proposed a CNN algorithm for the detection of specific targets in RSVP. In the experiment, images of faces and cars were regarded as targets or non-targets, respectively; images were presented at 2 Hz, and in each session the target probability was 10%. The proposed model achieved an AUC of 86.1%. Hajinoroozi et al. [57] adopted a CNN model targeting inter-subject and inter-task detection of RSVP; the experimental results showed that CNN worked well in the cross-task setting but failed to achieve satisfactory performance in the cross-subject scenario. Mao et al. [115] compared three different deep neural network algorithms in predicting whether the subject had seen the target or not; the MLP, CNN, and DBN models obtained AUCs of 81.7%, 79.6%, and 81.6%, respectively. The authors also applied a CNN model to analyze RSVP signals for person identification [116].

Representative deep learning models have also been applied to RSVP. Vareka et al. [190] verified whether deep learning performs well for single-trial P300 classification. They conducted an RSVP experiment in which the subjects were asked to recognize targets among non-targets and distracters. A DBN-AE was then implemented and compared with several non-deep-learning algorithms. The DBN-AE was composed of five AEs, with the hidden layer of the last AE having only two nodes, which can be used for classification through a softmax function. The proposed model achieved an accuracy of 69.2%. Manor et al. [112] applied two deep neural networks to RSVP signals after lowpass filtering (0-51 Hz): the discriminative CNN achieved an accuracy of 85.06%, while the representative convolutional D-AE achieved 80.68%.

(2) SSEP. Most deep learning based studies in the SSEP area focus on SSVEP [85]. SSVEP refers to brain oscillations evoked by flickering visual stimuli, generally produced in the parietal and occipital regions [199]. Attia et al. [14] aimed at finding an intermediate representation of SSVEP; a hybrid method combining CNN and RNN was proposed to capture meaningful features from the time domain directly, achieving an accuracy of 93.59%. Waytowich et al. [199] applied a compact CNN model directly to raw SSVEP signals without any hand-crafted features; the reported cross-subject mean accuracy was approximately 80%. Thomas et al. [183] first filtered the raw SSVEP signals through a bandpass filter (5-48 Hz) and then applied a discrete FFT to consecutive 512-point windows; the processed data were classified by a CNN (69.03%) and an LSTM (66.89%) independently.

Perez et al. [136] adopted a representative model, a sparse AE, to extract distinct features from SSVEP evoked by multi-frequency visual stimuli; the proposed model employed a softmax layer for the final classification and achieved an accuracy of 97.78%. Kulasingham et al. [83] classified SSVEP signals in the context of a guilty knowledge test; the authors applied DBN-RBM and DBN-AE independently and achieved accuracies of 86.9% and 86.01%, respectively. Hachem et al. [53] investigated the influence of fatigue on SSVEP through an MLP model during wheelchair navigation; the goal of this study was to identify the key parameters for switching between manual,



semi-autonomous, and autonomous wheelchair command. Aznan et al. [15] explored SSVEP classification where the signals were collected through dry electrodes; dry-electrode signals are more challenging due to their lower SNR than standard EEG signals. This study applied a discriminative CNN model and achieved the highest accuracy of 96% on a local dataset.

5.2.3 ERD/ERS. ERD/ERS, which mainly appears in sensory, cognitive, and motor procedures, is not widely used in BCI due to drawbacks such as unstable accuracy across subjects [69]. In most situations, ERD/ERS is regarded as a specific feature of EEG power for further analysis [35, 177]. In particular, ERD/ERS is calculated as the relative change in power with respect to baseline: ERD/ERS = (P_e - P_b)/P_b, where P_e denotes the signal power over a one-second segment during the event and P_b denotes the signal power over a one-second baseline segment before the event [32]. Generally, the baseline refers to the rest state. For example, Sakhavi et al. calculated the ERD/ERS map and analyzed the different patterns among different tasks; the analysis demonstrated that the dynamics of energy should be considered because static energy alone is not informative enough [147].
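The definition above translates directly into code. The following sketch assumes the inputs are already bandpass filtered to the frequency band of interest and uses the mean squared amplitude as the power estimate.

```python
import numpy as np

def erd_ers(event_seg: np.ndarray, baseline_seg: np.ndarray) -> float:
    """Return (P_e - P_b) / P_b; negative values indicate ERD, positive ERS."""
    p_e = np.mean(event_seg ** 2)      # event power P_e (one-second segment)
    p_b = np.mean(baseline_seg ** 2)   # baseline (rest) power P_b
    return (p_e - p_b) / p_b
```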

5.3 fNIRS
Until now, only a few researchers have paid attention to deep learning based fNIRS. Naseer et al. [127] analyzed the difference between two mental tasks (mental arithmetic and rest) based on fNIRS signals. The authors manually extracted six features from prefrontal-cortex fNIRS and compared six different classifiers. The results demonstrated that the MLP, with an accuracy of 96.3%, outperformed all the traditional classifiers, including SVM, KNN, naive Bayes, etc. Huve et al. [71] classified fNIRS signals collected from subjects during three mental states: subtraction, word generation, and rest; the employed MLP model achieved an accuracy of 66.48% based on hand-crafted features (e.g., the concentration of OxyHb/DeoxyHb). After that, the authors studied mobile robot control through fNIRS signals and obtained binary classification accuracies of 82% (offline) and 66% (online) [72]. Chiarelli et al. [32] exploited the combination of fNIRS and EEG for left/right MI EEG classification; sixteen features extracted from fNIRS signals (eight from OxyHb and eight from DeoxyHb) were fed into an MLP classifier with four hidden layers.

On the other hand, Hiroyasu et al. [64] attempted to detect the gender of subjects from their fNIRS signals. The authors employed a denoising D-AE with three hidden layers to extract distinctive features, which were fed into an MLP classifier for gender detection. The model was evaluated on a local dataset and attained an average accuracy of 81%. In this study, the authors also pointed out that, compared with PET and fMRI, fNIRS has higher temporal resolution and is more affordable [64].

5.4 fMRI
Recently, several deep learning methods have been applied to fMRI analysis, especially for the diagnosis of cognitive impairment [191, 200].

(1) Discriminative models. Among the discriminative models, CNN is a promising model for analyzing fMRI [161]. For example, Havaei et al. built a segmentation approach for brain tumors based on fMRI with a novel CNN algorithm that captures both global and local features simultaneously [61]: the convolutional filters have different sizes, so the small-size and large-size filters exploit the local and global features, respectively. Sarraf et al. applied a deep CNN to diagnose Alzheimer's Disease based on fMRI data for three-class recognition (normal, edema, or active tumor) [151]. Morenolopez et al. [117] employed a CNN model to analyze fMRI of brain tumor patients; the model was evaluated over the BRATS dataset and obtained an F1 score of 88%. Hosseini et al. employed CNN for feature extraction and SVM for detecting epileptic seizures [66].
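As an illustration of the multi-scale idea, here is a minimal two-branch sketch (our toy layout, not the exact architecture of [61]): a small-kernel branch captures local detail while a large-kernel branch covers wider context, and their feature maps are concatenated.

```python
import torch
import torch.nn as nn

class TwoScaleCNN(nn.Module):
    """Parallel small- and large-kernel branches over the same input."""
    def __init__(self, in_channels=1):
        super().__init__()
        # Small kernels respond to local detail.
        self.local = nn.Conv2d(in_channels, 16, kernel_size=3, padding=1)
        # Large kernels cover a wider spatial context.
        self.wide = nn.Conv2d(in_channels, 16, kernel_size=13, padding=6)

    def forward(self, x):
        # Concatenate local and global feature maps along the channel axis.
        return torch.cat([torch.relu(self.local(x)),
                          torch.relu(self.wide(x))], dim=1)

# One 64x64 single-channel slice -> 32 feature maps of the same size.
print(TwoScaleCNN()(torch.randn(1, 1, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```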


Furthermore, Li et al. proposed a data completion method based on CNN. In particular, it utilizes the information in fMRI data to complete positron emission tomography (PET) data and then trains a classifier on both fMRI and PET [95]. In this model, the input of the proposed CNN is an fMRI patch and the output is a PET patch; two convolutional layers with ten filters each map the input to the output. The experiments illustrated that the classifier trained on the combination of fMRI and PET (92.87%) outperformed the one trained on fMRI alone (91.92%). Moreover, Koyamada et al. used a nonlinear MLP to extract common features from different subjects. The model was evaluated over a dataset from the Human Connectome Project (HCP) [82].
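Following that description, a patch-to-patch mapping might look like the sketch below; the patch size and the final 1x1 projection back to one channel are our assumptions, since only the two ten-filter convolutional layers are specified.

```python
import torch
import torch.nn as nn

# Hypothetical fMRI-patch -> PET-patch regressor.
fmri_to_pet = nn.Sequential(
    nn.Conv2d(1, 10, kernel_size=3, padding=1),   # first mapping layer, 10 filters
    nn.ReLU(),
    nn.Conv2d(10, 10, kernel_size=3, padding=1),  # second mapping layer, 10 filters
    nn.ReLU(),
    nn.Conv2d(10, 1, kernel_size=1),              # project to a single-channel PET patch
)

fmri_patch = torch.randn(8, 1, 15, 15)            # batch of hypothetical 15x15 patches
pet_patch = fmri_to_pet(fmri_patch)               # same spatial size as the input
loss = nn.functional.mse_loss(pet_patch, torch.randn_like(pet_patch))  # dummy target
```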

(2) Representative models. A wide range of publications demonstrated the effectiveness of representative models in the recognition of fMRI data [173]. Hu et al. [68] demonstrated that deep learning outperforms other machine learning methods in the diagnosis of neurological disorders such as Alzheimer's disease. Firstly, the fMRI images were converted to a matrix representing the activity of 90 brain regions. Secondly, a correlation matrix was obtained by calculating the correlation between each pair of brain regions to represent the functional connectivity between different brain regions. Furthermore, a targeted AE was built to classify the correlation matrix, which is sensitive to AD. The proposed approach achieved an accuracy of 87.5%. Plis et al. [138] employed a DBN-RBM with three RBM components to extract distinctive features from ICA-processed fMRI and achieved an average F1 measure above 90% over four public datasets. Suk et al. compared the effectiveness of DBN-RBM and DBN-AE on Alzheimer's disease detection, and the experimental results showed that the former obtained an accuracy of 95.4%, slightly lower than the latter (97.9%) [174]. Suk et al. [175] applied a D-AE model to extract latent features from resting-state fMRI data for the diagnosis of Mild Cognitive Impairment (MCI); the latent features were fed into an SVM classifier, which achieved an accuracy of 72.58%. Ortiz et al. [130] proposed a multi-view DBN-RBM that receives information from MRI and PET simultaneously. The learned representations were sent to several simple SVM classifiers, which were ensembled into a higher-level, stronger classifier by voting.
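The connectivity step in such pipelines amounts to a region-by-region Pearson correlation matrix; a minimal sketch (ours, not code from [68]), assuming an array of 90 regional time series, is:

```python
import numpy as np

def connectivity_matrix(region_ts):
    """region_ts: [n_timepoints, n_regions] mean activity per brain region.

    Returns the [n_regions, n_regions] Pearson correlation matrix used
    as the input representation for a downstream classifier.
    """
    return np.corrcoef(region_ts.T)  # variables are regions, observations are time-points

rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 90))        # 200 volumes, 90 regions (hypothetical)
fc = connectivity_matrix(ts)               # 90 x 90, symmetric, unit diagonal
features = fc[np.triu_indices(90, k=1)]    # flatten the upper triangle for an AE/SVM
```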

(3) Generative models. The reconstruction of natural images based on fMRI data has attracted much attention [58, 160, 218]. Seeliger et al. [154] proposed a deep convolutional GAN for reconstructing visual stimuli from fMRI, which aimed at training a generator to create an image similar to the visual stimulus. The generator contains four convolutional layers to convert the input fMRI into a natural image. Han et al. [58] focused on the generation of synthetic multi-sequence fMRI using GAN. The generated images can be used for data augmentation for better diagnostic accuracy or for physician training to help better understand various diseases. The authors applied the existing Deep Convolutional GAN (DCGAN) [140] and Wasserstein GAN (WGAN) [13] and found that the former works better. Shen et al. [160] presented another image recovery approach that minimizes the distance between the real image and the image generated from real fMRI.

5.5 EOG

In most situations, EOG signals are regarded as artifacts that should be removed from the collected EEG. However, they can also be used as informative signals to deploy EOG based BCI. Although a few researchers have paid attention to EOG analysis, only limited papers utilized deep learning. For example, Xia et al. [202] attempted to detect subjects' sleep stages using only EOG signals. They employed a DBN-RBM for feature representation and an HMM for classification. Moreover, EOG has been widely used to supplement other signals (e.g., EEG) in several research topics such as emotion detection [78, 103, 196], sleep stage recognition [43, 176], and driving fatigue detection [40].


5.6 MEG

Garg et al. [48] worked on refining MEG signals by removing artifacts like eye blinks and cardiac activity. The MEG signals were decomposed by ICA first and then classified by a 1-D CNN model. The proposed approach achieved a sensitivity of 85% and a specificity of 97% over a local dataset. Hasasneh et al. [60] also focused on artifact detection (cardiac and ocular artifacts). The proposed approach uses a CNN to capture temporal features and an MLP to extract spatial information. Shu et al. [162] employed a sparse AE to learn the latent dependencies of MEG signals in a single-word decoding task. The results demonstrated that the proposed approach is advantageous for some subjects, although it did not produce an overall increase in decoding accuracy. Cichy et al. [34] applied a CNN model to recognize visual objects based on MEG and fMRI signals.

6 DISCUSSION

In this section, we first analyze which deep learning models are most suitable for BCI signals. Then, we summarize the popular deep learning models in BCI research. We hope this survey can help readers select the most effective and efficient methods when dealing with BCI signals. In Table 4, we summarize the BCI signals and the corresponding deep learning models of the state-of-the-art papers. The hybrid models are divided into three parts: the combination of RNN and CNN, the combination of representative and discriminative models (denoted 'Repre + Discri'), and others.

6.1 BCI Signals

Our investigations above reveal that studies on non-invasive signals dominate BCI research. Among the summarized 238 publications, only seven focus on invasive BCI, and most of them work on ECoG rather than intracortical signals. One important reason for this phenomenon is that invasive BCI has higher requirements on hardware and experimental environments. For example, the collection of ECoG signals requires a volunteer patient and a surgeon who can perform a craniotomy, which disqualifies most researchers. Moreover, there are few public datasets of invasive brain signals, so most people cannot access invasive data. In terms of the classification of invasive signals, CNN-related algorithms often have a higher ability to recognize the pulses within cortical neurons.

Besides, among the non-invasive signals, the studies on EEG far outnumber the sum of all the other BCI paradigms (fNIRS, fMRI, EOG, and MEG). Furthermore, about 70% of the EEG papers pay attention to spontaneous EEG (133 publications). For better understanding, we split the spontaneous EEG into several aspects: sleeping, motor imagery, emotional, mental disease, data augmentation, and others. First, the classification of sleeping EEG mainly depends on discriminative and hybrid models. Among the nineteen studies on sleep stage classification, six employed CNN or modified CNN models independently, while two papers adopted RNN models; three hybrid models were built on the combination of CNN and RNN. Second, in the research on MI EEG (30 publications), independent CNN and CNN-based hybrid models are widely used. As for representative models, DBN-RBM is often applied to capture latent features from MI EEG signals. Third, twenty-five publications relate to spontaneous emotional EEG. More than half of them employed representative models (such as D-AE, D-RBM, and especially DBN-RBM) for unsupervised feature learning. Most affective state recognition works classify the user's emotion as positive, neutral, or negative. Some researchers take a further step and classify valence and arousal, which is more complex and challenging. Fourth, the research on mental disease diagnosis is promising


Table 4. A summary of BCI studies based on deep learning models. The surveyed models comprise discriminative models (MLP, RNN, CNN), representative models (AE/D-AE, RBM/D-RBM, DBN-AE, DBN-RBM, VAE), generative models (GAN), and hybrid models (LSTM+CNN, representative + discriminative, and others). The studies applying deep learning to each signal type are grouped as follows.

- Invasive signals: [10, 11, 67, 80, 203]
- Spontaneous EEG, sleeping EEG: [23, 31, 39, 43, 44, 113, 156, 166, 176, 180, 184, 185, 192, 216]
- Spontaneous EEG, MI EEG: [32, 35, 41, 59, 74, 84, 86, 87, 89, 90, 107, 128, 129, 142, 144, 147, 172, 177, 179, 181, 188, 197, 207, 219, 220, 221, 223, 224, 225, 226]
- Spontaneous EEG, emotional EEG: [7, 30, 45, 47, 73, 78, 91, 92, 93, 96, 103, 104, 121, 194, 196, 204, 205, 212, 218, 229, 230, 231]
- Spontaneous EEG, mental disease EEG: [3, 4, 6, 9, 49, 65, 66, 67, 75, 97, 122, 123, 132, 145, 152, 155, 178, 187, 189, 201, 214, 215, 228]
- Spontaneous EEG, data augmentation: [1, 35, 76, 133]
- Spontaneous EEG, others: [8, 16, 19, 22, 29, 33, 40, 55, 56, 63, 70, 94, 125, 139, 141, 149, 159, 168, 169, 171, 182, 193, 209, 211, 222]
- EP, ERP, VEP: [20, 21, 25, 46, 56, 76, 77, 81, 86, 101, 102, 108, 109, 150, 158, 167, 230, 231]
- EP, ERP, RSVP: [26, 28, 51, 57, 98, 111, 112, 114, 115, 116, 157, 165, 190, 213]
- EP, ERP, AEP: [24, 25, 150, 170]
- EP, SSEP, SSVEP: [14, 15, 53, 83, 85, 136, 183, 186, 199]
- ERD/ERS: [32, 35, 147, 177]
- fNIRS: [32, 62, 64, 71, 72, 127]
- fMRI: [34, 58, 61, 66, 68, 74, 82, 95, 130, 138, 151, 154, 160, 161, 173, 174, 175, 186, 217]
- EOG: [40, 43, 103, 168, 196, 202]
- MEG: [34, 48, 60, 162]


and attractive. The majority of the related research focuses on the detection of epileptic seizure and Alzheimer's Disease. Since detection is a binary classification problem, many studies achieve high accuracy, above 90%. In this area, the standard CNN model and the D-AE are prevalent. One possible reason is that CNN and AE are the most well-known and effective deep learning models for classification and dimensionality reduction, respectively. Fifth, several publications pay attention to GAN-based data augmentation. At last, about thirty studies investigate other spontaneous EEG signals related to driving fatigue, audio/visual stimuli impact, cognitive/mental load, and eye state detection. These studies extensively apply standard CNN models and their variants.

Moreover, apart from spontaneous EEG, evoked potentials have also attracted much attention. On the one hand, within ERP, VEP and its subcategory RSVP have drawn many investigations because visual stimuli, compared with other stimuli, are easier to conduct and more applicable in the real world (e.g., the P300 speller can be used for brain typing). For VEP (21 publications), 11 studies applied discriminative models and six adopted hybrid models. In terms of RSVP, the standalone CNN dominates the algorithms. Apart from these, five papers focused on the analysis of AEP signals. On the other hand, among steady-state related research, only SSVEP has been studied with deep learning models; most studies only applied discriminative models for recognition of the target image. At last, few papers attempted to investigate ERD/ERS signals. Several publications utilized ERD/ERS to analyze the signals and calculated the ERD/ERS value as a distinct feature.

Furthermore, beyond the diverse EEG paradigms, a wide range of papers paid attention to fNIRS and fMRI. fNIRS images are rarely studied with deep learning, and the major studies just employed simple MLP models. We believe more attention should be paid to research on fNIRS for its high portability and low cost. As for fMRI, twenty-three papers proposed deep learning models for classification. The CNN model is widely used for its outstanding performance in feature learning from images, and several papers are also interested in image reconstruction based on fMRI signals. One reason fMRI is so popular is that several public datasets are available on the Internet, even though fMRI equipment is expensive. Nowadays, EOG is mainly regarded as noise instead of a useful signal; however, it enables individuals to communicate with the outer world by reflecting the user's eye movements. MEG signals are mainly used in the medical area, which has been less receptive to deep learning algorithms; thus, we found only very few studies on MEG. The sparse AE and CNN algorithms have a positive influence on the feature refining and classification of MEG.

6.2 Deep Learning Models

Our investigation shows that discriminative models appear the most in the summarized 238 publications. This is reasonable at a high level because almost all BCI issues can be regarded as classification problems. Another observation is that CNN accounts for more than 70% of the discriminative models, for which we provide the following reasons. First, the design of CNN is powerful enough to extract latent discriminative features and spatial dependencies from EEG signals for classification; as a result, CNN structures are adopted for classification in some studies and for feature engineering in others. Second, CNN has achieved great success in other research areas (e.g., computer vision), which makes it extremely famous and accessible (through available public code); thus, BCI researchers have more chances to understand and apply CNN in their work. Third, some BCI paradigms (e.g., fMRI, MEG) naturally form two-dimensional images conducive to processing by CNN. Meanwhile, other 1-D signals (e.g., EEG) can be converted into 2-D images for further analysis by CNN. Here, we provide several methods for


converting 1-D EEG signals (with multiple channels) into a 2-D matrix: 1) convert each time-point (i.e., one sampling point; at a 100 Hz sampling rate there are 100 time-points per second) into a 2-D image; 2) convert a segment into a 2-D matrix. For the first approach, suppose we have 32 channels; we collect 32 elements (each corresponding to a channel) at each time-point. As described in [91], the collected 32 elements can be converted into a 2-D image based on the spatial positions of the electrodes. For the second approach, suppose we have 32 channels and the segment contains 100 time-points. The collected data can be arranged as a matrix of shape [32, 100], where each row and each column refer to a specific channel and time-point, respectively. Fourth, there are many variants of CNN suitable for a wide range of BCI scenarios; for example, single-channel EEG signals can be processed by a 1-D CNN. In terms of RNN, only about 20% of the discriminative-model papers adopted RNN, which is much less than we expected since RNN has been demonstrated to be powerful in temporal feature learning. One possible reason for this phenomenon is that processing a long sequence with RNN is time-consuming, and EEG signals generally form long sequences. For example, sleep signals are usually sliced into 30-second segments, each containing 3,000 time-points at a 100 Hz sampling rate. For a sequence with 3,000 elements, our preliminary experiments show that RNN takes more than 20 times longer to train than CNN. Moreover, MLP is not popular because its effectiveness (e.g., non-linear modeling ability) is inferior to the other algorithms, given its simple architecture.
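A minimal sketch of the second conversion (our illustration, with the 32 channels and 100 time-points per segment assumed above):

```python
import numpy as np

fs = 100                      # sampling rate (Hz), so 1 s = 100 time-points
n_channels = 32
stream = np.random.randn(n_channels, 10 * fs)   # 10 s of hypothetical multi-channel EEG

# Slice the stream into non-overlapping 1-second segments;
# each segment is a [channels, time] = [32, 100] matrix, i.e. a 2-D "image".
segments = [stream[:, t:t + fs] for t in range(0, stream.shape[1], fs)]
print(segments[0].shape)      # (32, 100): rows = channels, columns = time-points
```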

As for representative models, DBN, especially DBN-RBM, is the most popular model for feature extraction. DBN is widely used in BCI for two reasons: 1) it efficiently learns the generative parameters that reveal the relationships of variables in neighboring layers; 2) it makes it straightforward to calculate the values of the latent variables in each hidden layer [36]. However, most work that employs the DBN-RBM model was published before 2016, indicating that DBN is no longer popular. It can be inferred that before 2016 researchers preferred to use DBN for feature learning followed by a non-deep-learning classifier; more recently, an increasing number of studies adopt CNN or hybrid models for both feature learning and classification.

Moreover, generative models are rarely employed independently. GAN- and VAE-based data augmentation and image reconstruction mainly focus on fMRI and EEG signals. It has been demonstrated that a classifier trained after data augmentation achieves more competitive performance. Therefore, this is a promising research direction for the future.

Last but not least, fifty-three publications proposed hybrid models for BCI studies. Among them, combinations of RNN and CNN account for about one-fifth. Since RNN and CNN have been shown to have excellent temporal and spatial feature extraction abilities, respectively, it is natural to combine them to learn both temporal and spatial features; a sketch of this family follows below. Another type of hybrid model is the combination of representative and discriminative models. This is easy to understand because the former is employed for feature refining and the latter for classification. Twenty-eight publications, covering almost all BCI signals, proposed this type of hybrid deep learning model; the adopted representative models are mostly AE or DBN-RBM, while the adopted discriminative models are mostly CNN. Apart from these, twelve papers proposed other hybrid models, such as combinations of two discriminative models. For example, several studies combined CNN and MLP, where the CNN structure extracts spatial features that are fed into an MLP for classification.
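The toy model below (ours, not any specific surveyed architecture) illustrates the first hybrid family: a 1-D convolution for spatial filtering across channels followed by an LSTM for temporal dynamics.

```python
import torch
import torch.nn as nn

class ConvRecurrent(nn.Module):
    """1-D convolution over the channel/time layout followed by an LSTM classifier."""
    def __init__(self, n_channels=32, n_classes=4):
        super().__init__()
        self.conv = nn.Conv1d(n_channels, 16, kernel_size=5, padding=2)       # spatial filtering
        self.lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)  # temporal dynamics
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                       # x: [batch, channels, time]
        h = torch.relu(self.conv(x))            # [batch, 16, time]
        out, _ = self.lstm(h.transpose(1, 2))   # LSTM expects [batch, time, features]
        return self.head(out[:, -1])            # classify from the last time step

logits = ConvRecurrent()(torch.randn(8, 32, 100))  # [8, 4] class scores
```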

7 BCI APPLICATIONS

Deep learning models have contributed to a variety of BCI applications, as summarized in Table 5. Papers focused on signal classification without an application background are not listed in this table; therefore, the number of publications in this table is smaller than in Table 4.


Table 5. Summary of deep learning based BCI applications. A 'Local' dataset refers to a private or unavailable dataset; the public datasets (with links) are introduced in Section 7.9. In the signals, S-EEG, MD EEG, and E-EEG denote sleeping EEG, mental disease EEG, and emotional EEG, respectively; 'EEG' alone refers to the other subcategories of spontaneous EEG. In the models, RF and LR denote the random forest and logistic regression algorithms, respectively. In the performance column, 'N/A', 'Sen', 'Spe', 'Aro', 'Val', 'Dom', and 'Lik' denote not found, sensitivity, specificity, arousal, valence, dominance, and liking, respectively.

(Each row: Reference | Signals | Deep Learning Model | Dataset | Performance.)

Health Care: Sleeping Quality Evaluation
- Vilamala et al. [192] | S-EEG | CNN | Sleep-EDF | 0.86
- Chambon et al. [31] | S-EEG | Multi-view CNN | MASS session 3 | N/A
- Zhang et al. [216] | S-EEG | DBN + voting | UCD | 0.9131
- Tsinalis et al. [184] | S-EEG | CNN | Sleep-EDF | 0.82
- Sors et al. [166] | S-EEG | CNN | SHHS | 0.87
- Manzano et al. [113] | S-EEG | CNN + MLP | Sleep-EDF | 0.732
- Shahin et al. [156] | S-EEG | MLP | University Hospital in Berlin | 0.9
- Supratak et al. [176] | EEG, EOG | CNN + LSTM | MASS/Sleep-EDF | 0.862/0.82
- Xia et al. [202] | EOG | DBN-RBM + HMM | Sleep-EDF | 0.833
- Ruffini et al. [145] | S-EEG | RNN | Local | 0.85
- Fraiwan et al. [44] | S-EEG | DBN-AE + MLP | Local | 0.804
- Tan et al. [180] | S-EEG | DBN-RBM | Local | 0.9278 (F1)
- Fernandez et al. [43] | EEG, EOG | CNN | SHHS | 0.9 (F1)
- Biswal et al. [23] | S-EEG | RNN | Local | 0.8576

Health Care: AD Detection
- Hu et al. [68] | fMRI | D-AE + MLP | ADNI | 0.875
- Morabito et al. [122] | MD EEG | CNN | Local | 0.82
- Suk et al. [174] | fMRI | DBN-AE; DBN-RBM | ADNI | 0.979; 0.954
- Zhao et al. [228] | MD EEG | DBN-RBM | Local | 0.92
- Sarraf et al. [151] | fMRI | CNN | ADNI | 0.9685
- Ortiz et al. [130] | fMRI, PET | DBN-RBM + SVM | ADNI | 0.9
- Li et al. [95] | fMRI | CNN + LR | ADNI | 0.9192

Health Care: Seizure Detection
- Tsiouris et al. [185] | MD EEG | LSTM | CHB-MIT | >0.99
- Yuan et al. [215] | MD EEG | Attention-MLP | CHB-MIT | 0.9661
- Yuan et al. [214] | MD EEG | D-AE + SVM | CHB-MIT | 0.95
- Ullah et al. [189] | MD EEG | CNN + voting | UBD | 0.954
- Lin et al. [97] | MD EEG | D-AE | UBD | 0.96
- Hosseini et al. [65] | MD EEG | D-AE + MLP | Local | 0.94
- Page et al. [132] | MD EEG | DBN-AE + LR | N/A | 0.8 ∼ 0.9
- Golmohammadi et al. [49] | MD EEG | RNN + CNN | TUH | Sen: 0.3083; Spe: 0.9686
- Wen et al. [201] | MD EEG | AE | Local | 0.92
- Acharya et al. [4] | MD EEG | CNN | UBD | 0.8867
- Schirrmeister et al. [152] | MD EEG | CNN | TUH | 0.854
- Hosseini et al. [66] | MD EEG | CNN | Local | N/A
- Talathi et al. [178] | MD EEG | GRU | UBD | 0.996
- Kiral et al. [80] | ECoG | MLP | Local | Sen: 0.69
- Johansen et al. [75] | MD EEG | CNN | Local | 0.947 (AUC)
- Ansari et al. [9] | MD EEG | CNN + RF | Local | 0.77
- Hosseini et al. [67] | EEG, ECoG | CNN | Local | 0.96
- Shah et al. [155] | MD EEG | CNN + LSTM | TUH | Sen: 0.39; Spe: 0.9037
- Turner et al. [187] | MD EEG | DBN-RBM + LR | Local | N/A


Table 5. Summary of deep learning based BCI applications (Continued).

(Each row: Reference | Signals | Deep Learning Model | Dataset | Performance.)

Health Care: Cardiac Artifact Detection
- Garg et al. [48] | MEG | CNN | Local | Sen: 0.85; Spe: 0.97
- Hasasneh et al. [60] | MEG | CNN + MLP | Local | 0.944

Health Care: Depression
- Acharya et al. [3] | MD EEG | CNN | Local | 0.935 ∼ 0.9596
- Al et al. [6] | MD EEG | DBN-RBM + MLP | Local | 0.695

Health Care: Brain Tumor
- Morenolopez et al. [117] | fMRI | CNN | BRATS | 0.88 (F1)
- Havaei et al. [61] | fMRI | Multi-scale CNN | BRATS | 0.88 (F1)
- Shreyas et al. [161] | fMRI | CNN | BRATS | 0.83

Health Care: Interictal Epileptic Discharge (IED)
- Antoniades et al. [10] | ECoG | CNN | Local | 0.8751
- Antoniades et al. [11] | EEG, ECoG | AE + CNN | Local | 0.68

Health Care: Schizophrenia
- Plis et al. [138] | fMRI | DBN-RBM | Combined | >0.9 (F1)
- Chu et al. [33] | EEG | CNN + RF + voting | Local | 0.816, 0.967, 0.992

Health Care: Creutzfeldt-Jakob Disease (CJD)
- Morabito et al. [123] | MD EEG | D-AE | Local | 0.81 ∼ 0.83

Health Care: Mild Cognitive Impairment (MCI)
- Suk et al. [175] | fMRI | AE + SVM | ADNI2 | 0.7258

Smart Environment: Robot Control
- Behncke et al. [22] | EEG | CNN | Local | 0.75
- Huve et al. [72] | fNIRS | MLP | Local | 0.82

Smart Environment: Exoskeleton Control
- Kwak et al. [85] | SSVEP | CNN | Local | 0.9403

Smart Environment: Smart Home
- Zhang et al. [220] | MI EEG | RNN | EEGMMI | 0.9553

Communication
- Kawasaki et al. [77] | VEP | MLP | Local | 0.908
- Cecotti et al. [25] | VEP | CNN + voting | BCI-C III, Dataset II | 0.955
- Zhang et al. [223] | MI EEG | LSTM + CNN + AE | Local | 0.9452
- Cecotti et al. [25] | VEP | CNN | BCI-C III, Dataset II | 0.945
- Maddula et al. [109] | VEP | RCNN | Local | 0.65 ∼ 0.76
- Liu et al. [101] | VEP | CNN | BCI-C III, Dataset II | 0.92 ∼ 0.96

Security: Identification
- Zhang et al. [225] | MI EEG | Attention-based RNN | EEGMMI + local | 0.9882
- Koike et al. [81] | VEP | MLP | Local | 0.976
- Mao et al. [116] | RSVP | CNN | Local | 0.97

Security: Authentication
- Zhang et al. [219] | MI EEG | Hybrid | EEGMMI + local | 0.984

Affective Computing
- Miranda et al. [121] | E-EEG | RNN + CNN | AMIGOS | <0.7
- Jia et al. [73] | E-EEG | DBN-RBM | DEAP | 0.8 ∼ 0.85 (AUC)
- Li et al. [91] | E-EEG | Hierarchical CNN | SEED | 0.882
- Xu et al. [204] | E-EEG | DBN-AE, DBN-RBM | DEAP | >0.86 (F1)
- Liu et al. [104] | E-EEG | CNN | Local | 0.82
- Frydenlund et al. [45] | E-EEG | MLP | DEAP | N/A
- Yin et al. [212] | E-EEG | Multi-view D-AE + MLP | DEAP | Aro: 0.7719; Val: 0.7617
- Chai et al. [30] | E-EEG | AE | SEED | 0.818
- Kawde et al. [78] | EEG, EOG | DBN-RBM | DEAP | Aro: 0.7033; Val: 0.7828; Dom: 0.7016
- Li et al. [96] | E-EEG | DBN-RBM | DEAP | Aro: 0.642; Val: 0.584; Dom: 0.658
- Xu et al. [205] | E-EEG | DBN-RBM | DEAP | Aro: 0.6984; Val: 0.6688; Lik: 0.7539


Table 5. Summary of deep learning based BCI applications (Continued).

(Each row: Reference | Signals | Deep Learning Model | Dataset | Performance.)

Affective Computing (continued)
- Zheng et al. [229] | E-EEG | DBN-RBM + HMM | Local | 0.8762
- Alhagry et al. [7] | E-EEG | LSTM + MLP | DEAP | Aro: 0.8565; Val: 0.8545; Lik: 0.8799
- Li et al. [64] | E-EEG | CNN | SEED | 0.882
- Zhang et al. [230, 231] | E-EEG | DBN-RBM + MLP | SEED | 0.8608
- Liu et al. [103] | EEG, EOG | AE | SEED, DEAP | 0.9101, 0.8325
- Gao et al. [47] | E-EEG | DBN-RBM + MLP | Local | 0.684
- Zhang et al. [218] | E-EEG | RNN | SEED | 0.895

Driver Fatigue Detection
- Hung et al. [70] | EEG | CNN | Local | 0.572 (RMSE)
- Hajinoroozi et al. [55] | EEG | DBN-RBM | Local | 0.85
- Du et al. [40] | EEG, EOG | D-AE + SVM | Local | 0.094 (RMSE)
- San et al. [149] | EEG | DBN-RBM + SVM | Local | 0.7392
- Almogbel et al. [8] | EEG | CNN | Local | 0.9531
- Hachem et al. [53] | SSVEP | MLP | Local | 0.75
- Chai et al. [29] | EEG | DBN + MLP | Local | 0.931
- Hajinoroozi et al. [56] | EEG | CNN | Local | 0.8294

Mental Load Measurement
- Naseer et al. [127] | fNIRS | MLP | Local | 0.963
- Yin et al. [211] | EEG | D-AE | Local | 0.9584
- Hennrich et al. [62] | fNIRS | MLP | Local | 0.641
- Bashivan et al. [20] | EEG | R-CNN | Local | 0.9111
- Bashivan et al. [21] | EEG | DBN + MLP | Local | N/A
- Bashivan et al. [19] | EEG | DBN-RBM | Local | 0.92
- Li et al. [94] | EEG | DBN-RBM | Local | 0.9886

Other Applications: School Bullying
- Baltatzis et al. [16] | EEG | CNN | Local | 0.937

Other Applications: Music Detection
- Stober et al. [169] | EEG | CNN | Local | 0.776
- Stober et al. [171] | EEG | AE + CNN | OpenMIIR | 0.27 (12-class)
- Stober et al. [170] | EEG | CNN | Local | 0.244
- Sternin et al. [168] | EEG, EOG | CNN | Local | 0.75

Other Applications: Number Choosing
- Waytowich et al. [199] | SSVEP | CNN | Local | 0.8

Other Applications: Visual Object Recognition
- Cichy et al. [34] | fMRI, MEG | CNN | N/A | N/A
- Manor et al. [111] | RSVP | CNN | Local | 0.75
- Cecotti et al. [28] | RSVP | CNN | Local | 0.897 (AUC)
- Hajinoroozi et al. [57] | RSVP | CNN | Local | 0.7242 (AUC)
- Perez et al. [136] | SSVEP | AE | Local | 0.9778
- Shamwell et al. [157] | RSVP | CNN | Local | 0.7252 (AUC)

Other Applications: Finger Trajectory
- Xie et al. [203] | ECoG | RNN + CNN | BCI-C IV | N/A

Other Applications: Guilty Knowledge Test
- Kulasingham et al. [83] | SSVEP | DBN-RBM; DBN-AE | Local | 0.869; 0.8601

Other Applications: Concealed Information Test
- Liu et al. [102] | EEG | DBN-RBM | Local | 0.973

Other Applications: Flanker Task
- Volker et al. [193] | EEG | CNN | Local | 0.841

Other Applications: Eye State
- Narejo et al. [125] | EEG | DBN-RBM | UCI | 0.989
- Reddy et al. [141] | EEG | MLP | Local | 0.975

Other Applications: User Preference
- Teo et al. [182] | EEG | MLP | Local | 0.6399

Other Applications: Emergency Braking
- Hernandez et al. [63] | EEG | CNN | Local | 0.718

Other Applications: Gender Detection
- Hiroyasu et al. [64] | fNIRS | D-AE + MLP | Local | 0.81
- Putten et al. [139] | EEG | CNN | Local | 0.81


7.1 Health Care

In the health care area, deep learning based BCI systems mainly work on the detection and diagnosis of mental diseases such as sleeping disorders, Alzheimer's Disease, epileptic seizure, and other disorders. In the first place, for sleeping disorder detection, most studies focus on sleep stage detection based on sleeping spontaneous EEG. In this situation, researchers do not need to recruit patients with sleeping disorders because sleeping EEG signals can easily be collected from healthy individuals. In terms of algorithms, it can be observed from Table 5 that DBN-RBM and CNN are widely adopted for feature engineering and classification. Ruffini et al. [145] go one step further by detecting REM Behavior Disorder (RBD), which may lead to neurodegenerative diseases such as Parkinson's disease. They achieved an average accuracy of 85% in distinguishing RBD patients from healthy controls.

Moreover, fMRI is widely applied in the diagnosis of Alzheimer's Disease. By taking advantage of the high spatial resolution of fMRI, the diagnosis achieved accuracies above 90% in several studies. Another reason contributing to the competitive performance is the binary classification scenario. Apart from that, several publications diagnose AD based on spontaneous EEG [122, 228].

Besides, the diagnosis of epileptic seizure has attracted much attention. Seizure detection is mainly based on spontaneous EEG and a few ECoG signals. The popular deep learning models in this scenario include standalone CNN and RNN, along with hybrid models combining RNN and CNN. Some models integrated deep learning for feature extraction with a traditional classifier for detection [132, 187]. For example, Yuan et al. [214] applied a D-AE for feature engineering followed by an SVM for seizure diagnosis. Ullah et al. [189] adopted voting for post-processing: several different CNN classifiers were trained, and the final result was predicted by voting, as sketched below.
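The voting step reduces to a majority vote over the per-model predictions; a generic sketch (ours, not the exact pipeline of [189]) is:

```python
import numpy as np

def majority_vote(predictions):
    """predictions: [n_models, n_samples] integer class labels.

    Returns the most frequent label per sample across the ensemble.
    """
    n_classes = predictions.max() + 1
    # Per sample (column), count how many models voted for each class.
    votes = np.apply_along_axis(np.bincount, 0, predictions, minlength=n_classes)
    return votes.argmax(axis=0)

# Three hypothetical CNN outputs over five EEG segments (0 = normal, 1 = seizure).
preds = np.array([[0, 1, 1, 0, 1],
                  [0, 1, 0, 0, 1],
                  [1, 1, 1, 0, 0]])
print(majority_vote(preds))  # [0 1 1 0 1]
```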

Furthermore, many other healthcare issues can be addressed by BCI systems. Cardiac artifacts in MEG can be automatically detected by deep learning models [48, 60]. Several modified CNN structures have been proposed to detect brain tumors based on fMRI from the public BRATS dataset [61, 161]. Researchers have demonstrated the effectiveness of deep learning models in the detection of a wide range of mental diseases such as depression [3], Interictal Epileptic Discharge (IED) [10], schizophrenia [138], Creutzfeldt-Jakob Disease (CJD) [123], and Mild Cognitive Impairment (MCI) [175].

7.2 Smart Environment

The smart environment is a promising application scenario for BCI in the future. With the development of the Internet of Things (IoT), an increasing number of smart environments can be connected to BCI. For example, an assisting robot can be used in a smart home [220, 224], where the robot is controlled by the individual's brain signals. Moreover, Behncke et al. [22] and Huve et al. [72] investigated the robot control problem based on visually stimulated spontaneous EEG and fNIRS signals, respectively. A BCI-controlled exoskeleton could help people whose sub-limb motor systems are damaged with walking and daily activities [85]. In the future, research on brain-controlled appliances may benefit the elderly and people with disabilities in smart homes and smart hospitals.

7.3 Communication

One of the biggest advantages of BCI, compared with other human-machine interface techniques, is that it enables patients who have lost most motor abilities, such as speaking, to communicate with the outer world. Deep learning technology has improved the efficiency of brain signal based communication. One typical paradigm that enables individuals to type without any motor movement is the P300 speller, which converts the user's intent into text [77]. Powerful deep learning models empower BCI systems to distinguish P300 segments, which contain the communication information of the user, from non-P300 segments [25]. At a higher level, representative deep learning models can help detect which character the user is focusing on and print it on the screen to chat with others [25, 101, 109]. Additionally, Zhang et al. [223] proposed a hybrid model combining RNN, CNN, and AE to extract informative features from MI EEG to recognize which letter the user wants to spell.

7.4 Security

BCI can be used for security tasks such as identification (or recognition) and authentication (or verification). The former conducts multi-class classification to recognize a person's identity [225]; the latter conducts binary classification to decide whether a person is authorized [219].

The majority of existing biometric identification/authentication systems rely on individuals' intrinsic physiological features such as the face, iris, retina, voice, and fingerprint [225]. They are vulnerable to various attacks based on anti-surveillance prosthetic masks, contact lenses, vocoders, and fingerprint films. EEG based biometric person identification is a promising alternative given its high resilience to spoofing attacks: an individual's EEG signals are virtually impossible for an imposter to mimic. Koike et al. [81] adopted deep neural networks to identify the user's ID based on VEP signals; Mao et al. [116] applied CNN for person identification based on RSVP signals; Zhang et al. [225] proposed an attention-based LSTM model and evaluated it over both public and local datasets. EEG signals have also been combined with gait information in a hybrid deep learning model for a dual-authentication system [219].

7.5 Affective Computing

The affective states of a user provide critical information for many applications such as personalized information (e.g., multimedia content) retrieval and intelligent human-computer interface design [204]. Recent research illustrates that deep learning models can enhance performance in affective computing. The most widely used circumplex model holds that emotions are distributed in two dimensions: arousal and valence. Arousal refers to the intensity of the emotional stimulus, or how strong the emotion is; valence refers to how positive or negative the experienced emotion is. In some other models, the dominance and liking dimensions are also deployed.

Some research [91, 104, 194] attempts to classify users' emotional states into two (positive/negative) or three categories (positive, neutral, and negative) based on EEG signals using deep learning algorithms such as CNN and its variants [45]. DBN-RBM is the most representative deep learning model for discovering the concealed features of emotional spontaneous EEG [204, 230]. Xu et al. [204] applied DBN-RBM as a feature extractor to classify affective states based on EEG.

Further, some researchers aim to recognize the positive/negative state of each specific emotional dimension. For example, Yin et al. [212] employed an ensemble of AE classifiers to recognize the user's affect. Each AE uses three hidden layers to filter out noise and derive stable physiological feature representations. The proposed model was evaluated over the DEAP benchmark and achieved accuracies of 77.19% for arousal and 76.17% for valence.

7.6 Driver Fatigue Detection

Vehicle drivers' ability to stay alert and maintain optimal performance dramatically affects traffic safety [8]. EEG signals have proven useful in evaluating a human's cognitive state in different contexts. Generally, a driver is regarded as being in an alert state if the reaction time is lower than 0.7 seconds and in a fatigued state if it is higher than 2.1 seconds. Hajinoroozi et al. [55] considered the detection of driver fatigue from EEG signals by discovering distinctive features; they explored an approach based on DBN for dimension reduction.

Detecting driver fatigue is crucial because driver drowsiness may lead to disaster, and its detection is feasible in practice. On the hardware side, EEG collection equipment is off-the-shelf and portable enough to be used in a car, and the price of an EEG headset is affordable for most people. On the algorithm side, deep learning models have enhanced the performance of fatigue detection. As we summarized, EEG based driving drowsiness can be recognized with high accuracy (82% ∼ 95%).

A future scope of driver fatigue detection lies in the self-driving scenario. In most self-driving situations (e.g., automation level 3; see footnote 5), the human driver is expected to respond appropriately to a request to intervene, which means the driver should remain alert. Therefore, we believe the application of BCI based driver fatigue detection will benefit the development of self-driving cars.

7.7 Mental Load Measurement

EEG oscillations can be used to measure the mental workload level, which can sustain decision making and strategy development in the context of human-machine interaction [211]. For example, an abnormal mental workload of a human operator may result in performance degradation, which could cause catastrophic accidents [135].

Several researchers have paid attention to this topic. The mental workload can be measured from fNIRS signals or spontaneous EEG. Naseer et al. adopted an MLP algorithm for fNIRS based binary mental task classification (mental arithmetic vs. rest) [127]. The experimental results showed that the MLP outperformed traditional classifiers like SVM and kNN, achieving the highest accuracy of 96.3%. Bashivan et al. [19] presented a statistical approach, a DBN model, for the recognition of mental workload levels based on single-trial EEG. Before the DBN, the authors manually extracted the wavelet entropy and band-specific power from three frequency bands (theta, alpha, and beta). The experiments demonstrated that the recognition of mental workload achieved an overall accuracy of 92%.
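Band-specific power features of this kind can be approximated with Welch's method; a minimal sketch (ours, not the feature extraction code of [19]), assuming a 1-D EEG trace sampled at 160 Hz:

```python
import numpy as np
from scipy.signal import welch

BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}  # Hz

def band_powers(signal, fs):
    """Integrate the Welch power spectrum over each frequency band."""
    freqs, psd = welch(signal, fs=fs, nperseg=fs * 2)
    return {name: np.trapz(psd[(freqs >= lo) & (freqs < hi)],
                           freqs[(freqs >= lo) & (freqs < hi)])
            for name, (lo, hi) in BANDS.items()}

rng = np.random.default_rng(0)
print(band_powers(rng.standard_normal(10 * 160), fs=160))  # 10 s of synthetic EEG
```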

7.8 Other Applications

There are plenty of interesting scenarios beyond the above where deep learning based BCI can apply, such as recommender systems [182] and emergency braking [63]. One possible topic is the recognition of a visual object, which may be used in the guilty knowledge test [83] and the concealed information test [102]: the neurons of a participant produce a pulse when he/she suddenly watches a familiar object. Based on this theory, visual target recognition mainly uses RSVP signals. Cecotti et al. [28] aimed to build a common model for target recognition that works across various subjects instead of for a specific subject.

Besides, researchers have investigated distinguishing a subject's gender from fNIRS [64] and spontaneous EEG [139]. Hiroyasu et al. [64] adopted deep learning to recognize the gender of the subject based on cerebral blood flow, and the experimental results suggested that cerebral blood flow changes in different ways for males and females. Putten et al. [139] tried to discover sex-specific information from brain rhythms and adopted a CNN model to recognize the participant's gender. This paper illustrated that fast beta activity (20 ∼ 25 Hz) is one of the most distinctive attributes.

5 https://en.wikipedia.org/wiki/Self-driving_car


Table 6. A summary of public datasets for BCI systems. '# Sub', '# Cla', and 'S-Rate' denote the number of subjects, the number of classes, and the sampling rate, respectively. FM denotes finger movement, while BCI-C denotes the BCI Competition.

(Each row: Dataset | # Sub | # Cla | S-Rate | # Channel.)

Invasive
- FM ECoG: BCI-C IV, Dataset IV | 3 | 5 | 1000 Hz | 48 ∼ 64
- MI ECoG: BCI-C III, Dataset I | 1 | 2 | 1000 Hz | 64

EEG: Sleeping EEG
- Sleep-EDF (Telemetry) | 22 | 6 | 100 Hz | 2 EEG, 1 EOG, 1 EMG
- Sleep-EDF (Cassette) | 78 | 6 | 100/1 Hz | 2 EEG (100 Hz), 1 EOG (100 Hz), 1 EMG (1 Hz)
- MASS-1 | 53 | 5 | 256 Hz | 17/19 EEG, 2 EOG, 5 EMG
- MASS-2 | 19 | 6 | 256 Hz | 19 EEG, 4 EOG, 1 EMG
- MASS-3 | 62 | 5 | 256 Hz | 20 EEG, 2 EOG, 3 EMG
- MASS-4 | 40 | 6 | 256 Hz | 4 EEG, 4 EOG, 1 EMG
- MASS-5 | 26 | 6 | 256 Hz | 20 EEG, 2 EOG, 3 EMG
- SHHS | 5804 | N/A | 125/50 Hz | 2 EEG (125 Hz), 1 EOG (50 Hz), 1 EMG (125 Hz)

EEG: Seizure EEG
- CHB-MIT | 22 | 2 | 256 Hz | 18
- TUH | 315 | 2 | 200 Hz | 19

EEG: MI EEG
- EEGMMI | 109 | 4 | 160 Hz | 64
- BCI-C II, Dataset III | 1 | 2 | 128 Hz | 3
- BCI-C III, Dataset III a | 3 | 4 | 250 Hz | 60
- BCI-C III, Dataset III b | 3 | 2 | 125 Hz | 2
- BCI-C III, Dataset IV a | 5 | 2 | 1000 Hz | 118
- BCI-C III, Dataset IV b | 1 | 2 | 1000 Hz | 118
- BCI-C III, Dataset IV c | 1 | 2 | 1000 Hz | 118
- BCI-C IV, Dataset I | 7 | 2 | 1000 Hz | 64
- BCI-C IV, Dataset II a | 9 | 4 | 250 Hz | 22 EEG, 3 EOG
- BCI-C IV, Dataset II b | 9 | 2 | 250 Hz | 3 EEG, 3 EOG

EEG: Emotional EEG
- AMIGOS | 40 | 4 | 128 Hz | 14
- SEED | 15 | 3 | 200 Hz | 62
- DEAP | 32 | 4 | 512 Hz | 32

EEG: Others
- OpenMIIR | 10 | 12 | 512 Hz | 64

VEP
- BCI-C II, Dataset II b | 1 | 36 | 240 Hz | 64
- BCI-C III, Dataset II | 2 | 26 | 240 Hz | 64

fMRI
- ADNI | 202 | 3 | N/A | N/A
- BRATS 2013 | 65 | 4 | N/A | N/A

MEG
- BCI-C IV, Dataset III | 2 | 4 | 400 Hz | 10

7.9 Benchmark Datasets

We have extensively explored the benchmark datasets usable for deep learning based BCI (Table 6). We provide 31 public datasets with download links, covering most BCI signal types. In particular, BCI Competition IV (BCI-C IV) contains five datasets accessible via the same link. For better understanding, we present the number of subjects, the number of classes, the sampling rate, and the number of channels of each dataset. In the '# Channel' column, the default channel type is EEG; some datasets contain additional biometric signals (e.g., ECG), but we only list the channels related to BCI.


8 FUTURE DIRECTIONS

Although deep learning has lifted the performance of BCI systems, technical and usability challenges remain. The technical challenges concern the classification ability in complex BCI scenarios, and the usability challenges refer to limitations in large-scale real-world deployment. In this section, we introduce these challenges and point out possible solutions.

8.1 Explainable General Framework

Until now, we have introduced several types of BCI signals (e.g., spontaneous EEG, ERP, fMRI) and the deep learning models that have been applied to each type. One promising research direction for deep learning based BCI is to develop a general framework that can handle various BCI signals regardless of the number of channels used for signal collection, the sample dimensions (e.g., 1-D or 2-D samples), the stimulation types (e.g., visual or audio stimuli), etc. The general framework would require two key capabilities: an attention mechanism and the ability to capture latent features. The former guarantees that the framework focuses on the most valuable parts of the input signals, and the latter enables the framework to capture distinctive and informative features.

The attention mechanism can be implemented based on attention scores or through various machine learning algorithms such as reinforcement learning. Attention scores can be inferred from the input data and act as weights that help the framework pay attention to the parts with high scores. Reinforcement learning has been shown to be able to find the most valuable parts through policy search [226]. CNN is the most suitable structure for capturing features at various levels and ranges. In the future, CNN could be used as a fundamental feature learning tool and be integrated with suitable attention mechanisms to form a general classification framework.
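In its simplest form, score-based attention is a learned softmax weighting over the parts of the input; the toy sketch below (ours, not a specific surveyed design) weights the time steps of a feature sequence:

```python
import torch
import torch.nn as nn

class ScoreAttention(nn.Module):
    """Weight each time step of a feature sequence by a learned score."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one scalar score per time step

    def forward(self, h):                        # h: [batch, time, dim]
        w = torch.softmax(self.score(h), dim=1)  # normalized attention weights over time
        return (w * h).sum(dim=1), w             # weighted summary and the weights

summary, weights = ScoreAttention(16)(torch.randn(4, 100, 16))
print(summary.shape, weights.shape)  # torch.Size([4, 16]) torch.Size([4, 100, 1])
```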

An additional direction to consider is how to interpret the feature representations derived by deep neural networks: what is the intrinsic relationship between the learned features and the task-related neural patterns, or the neuropathology of mental disorders? More and more people are realizing that interpretation could be even more important than prediction performance, since we usually just treat deep learning as a black box.

8.2 Person-independent Classification

Until now, most BCI classification tasks have focused on person-dependent scenarios, where the training and testing samples are collected from the same individual. The future direction is to realize person-independent classification, such that the testing data never appear in the training set. High-performance person-independent classification is compulsory for the wide application of BCI systems in the real world.

Dataset links for Table 6:
- BCI-C IV: http://www.bbci.de/competition/iv/
- BCI-C III: http://www.bbci.de/competition/iii/
- Sleep-EDF: https://physionet.org/physiobank/database/sleep-edfx/
- MASS: https://massdb.herokuapp.com/en/
- SHHS: https://physionet.org/pn3/shhpsgdb/
- CHB-MIT: https://physionet.org/pn6/chbmit/
- TUH: https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml
- EEGMMI: https://physionet.org/pn4/eegmmidb/
- BCI-C II: http://www.bbci.de/competition/ii/
- AMIGOS: http://www.eecs.qmul.ac.uk/mmv/datasets/amigos/readme.html
- SEED: http://bcmi.sjtu.edu.cn/~seed/download.html
- DEAP: https://www.eecs.qmul.ac.uk/mmv/datasets/deap/
- OpenMIIR: https://owenlab.uwo.ca/research/the_openmiir_dataset.html
- ADNI: http://adni.loni.usc.edu/data-samples/access-data/
- BRATS: https://www.med.upenn.edu/sbia/brats2018/data.html


One possible solution to achieve this goal is to build a personalized model with transfer learning. A personalized affective model can adopt a transductive parameter transfer approach to construct individual classifiers and to learn a regression function that maps the relationship between data distributions and classifier parameters [232]. Another potential solution is mining the subject-independent components of the input data. The input data can be decomposed into two parts: a subject-dependent component, which varies with the subject, and a subject-independent component, which is common to all subjects. A hybrid multi-task model can work on two tasks simultaneously, one focusing on person identification and the other on class recognition. A well-trained, converged model is expected to extract subject-independent features in the class recognition task.
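A hedged sketch of such a multi-task layout (shared encoder, one subject head, one class head; all dimensions and names are our assumptions):

```python
import torch
import torch.nn as nn

class MultiTaskEEG(nn.Module):
    """Shared encoder with separate subject-ID and class heads."""
    def __init__(self, in_dim=320, n_subjects=10, n_classes=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU())
        self.subject_head = nn.Linear(64, n_subjects)  # subject-dependent task
        self.class_head = nn.Linear(64, n_classes)     # subject-independent task

    def forward(self, x):
        z = self.encoder(x)
        return self.subject_head(z), self.class_head(z)

model = MultiTaskEEG()
subj_logits, class_logits = model(torch.randn(8, 320))
# Joint objective: the class head is pushed toward features that generalize
# across subjects while the subject head absorbs identity-specific cues.
loss = (nn.functional.cross_entropy(subj_logits, torch.randint(10, (8,))) +
        nn.functional.cross_entropy(class_logits, torch.randint(4, (8,))))
```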

8.3 Semi-supervised and Unsupervised Classification

The performance of deep learning highly depends on the size of the training data, which, however, requires expensive and time-consuming manual labeling to collect abundant class labels in a wide range of scenarios such as sleeping EEG. While supervised learning requires both observations and labels for training, unsupervised learning requires no labels, and semi-supervised learning requires only partial labels [73]. The latter two are, therefore, more suitable for problems with little ground truth.

Zhang et al. proposed an Adversarial Variational Embedding (AVAE) framework that combines a VAE++ model (as a high-quality generative model) and a semi-supervised GAN (as a posterior distribution learner) [227] for robust and effective semi-supervised learning. Jia et al. proposed a semi-supervised framework that leverages the data distribution of unlabeled data to prompt the representation learning of labeled data [73].

Two methods may enhance unsupervised learning: one is to employ crowd-sourcing to label the unlabeled observations; the other is to leverage unsupervised domain adaptation to align the distribution of the source BCI signals with the distribution of the target signals through a linear transformation.

8.4 Hardware Portability

Poor hardware portability has been preventing BCI systems from wide application in the real world. In most scenarios, users would like small, comfortable, or even wearable BCI hardware to collect brain signals and to control appliances and assistive robots.

Currently, there are three types of EEG collection equipment: unportable systems, portable headsets, and ear-EEG sensors. The unportable equipment (e.g., BCI 2000) offers high sampling frequency, many channels, and high signal quality but is expensive; it is suitable for physical examinations in a hospital. The portable headsets (e.g., Neurosky, Emotiv EPOC) have 1 ∼ 14 channels and 128 ∼ 256 Hz sampling rates but may discomfort people after long-time use. The ear-EEG sensors, which are attached to the outer ear, have gained increasing attention recently but remain mostly at the laboratory stage [131]. They contain a series of electrodes placed in each ear canal and concha [119]. The cEEGrid, to the best of our knowledge, is the only commercial ear-EEG device; it has multi-channel sensor arrays placed around the ear using an adhesive (see footnote 21) and is even more expensive. A promising future direction is to improve usability by developing cheaper (e.g., less than $200) and more comfortable (e.g., wearable for longer than 3 hours without discomfort) wireless ear-EEG equipment.

21 http://ceegrid.com/home/concept/


9 CONCLUSION

In this paper, we systematically survey the recent advances in deep learning models for Brain-Computer Interface. Compared with traditional methods, deep learning not only learns high-level features automatically from BCI signals but also depends less on hand-crafted features and domain knowledge. We summarize the BCI signals and dominant deep learning models, then discuss the state-of-the-art deep learning techniques for BCI and identify suitable deep learning algorithms for each BCI signal type. Finally, we overview deep learning based BCI applications and point out the open challenges and future directions.

REFERENCES

[1] S. Abdelfattah et al. 2018. Augmenting the Size of EEG Datasets Using Generative Adversarial Networks. In IJCNN.
[2] S. Abdulkader et al. 2015. Brain computer interfacing: Applications and challenges. Egypt. Inform. J. (2015).
[3] U. Acharya et al. 2018. Automated EEG-based screening of depression using deep convolutional neural network. Computer Methods and Programs in Biomedicine (2018).
[4] U. Acharya et al. 2018. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Computers in Biology and Medicine (2018).
[5] M. Ahn et al. 2015. Performance variation in motor imagery brain-computer interface: a brief review. Journal of Neuroscience Methods (2015).
[6] Alaa M. Al-kaysi, Ahmed Al-Ani, and Tjeerd W. Boonstra. 2015. A multichannel deep belief network for the classification of EEG data. In ICONIP.
[7] S. Alhagry et al. 2017. Emotion Recognition based on EEG using LSTM Recurrent Neural Network. Emotion (2017).
[8] M. Almogbel et al. 2018. EEG-signals based cognitive workload detection of vehicle driver using deep learning. In ICACT.
[9] A. Ansari et al. 2018. Neonatal seizure detection using deep convolutional neural networks. Int J Neural Syst (2018).
[10] A. Antoniades et al. 2016. Deep learning for epileptic intracranial EEG data. In MLSP.
[11] A. Antoniades et al. 2018. Deep neural architectures for mapping scalp to intracranial EEG. Int J Neural Syst (2018).
[12] P. Arico et al. 2018. Passive BCI beyond the lab: current trends and future directions. Physiological Measurement (2018).
[13] M. Arjovsky et al. 2017. Wasserstein generative adversarial networks. In ICML.
[14] M. Attia et al. 2018. A time domain classification of steady-state visual evoked potentials using deep recurrent-convolutional neural networks. In ISBI 2018.
[15] N. Aznan et al. 2018. On the Classification of SSVEP-Based Dry-EEG Signals via Convolutional Neural Networks. arXiv preprint arXiv:1805.04157 (2018).
[16] V. Baltatzis et al. 2017. Bullying incidences identification within an immersive environment using HD EEG-based analysis: A Swarm Decomposition and Deep Learning approach. Scientific Reports (2017).
[17] A. C. Banumathi et al. 2017. Deep Learning Architectures, Algorithms for Speech Recognition: An Overview. International Journal of Advanced Research in Computer Science and Software Engineering (2017).
[18] A. Bashashati et al. 2007. A survey of signal processing algorithms in brain-computer interfaces based on electrical brain signals. J Neural Eng (2007).
[19] P. Bashivan et al. 2015. Single trial prediction of normal and excessive cognitive load through EEG feature fusion. In SPMB.
[20] P. Bashivan et al. 2016. Learning representations from EEG with deep recurrent-convolutional neural networks. ICLR (2016).
[21] P. Bashivan et al. 2016. Mental state recognition via wearable EEG. arXiv preprint arXiv:1602.00985 (2016).
[22] J. Behncke et al. 2018. The signature of robot action success in EEG signals of a human observer: Decoding and visualization using deep convolutional neural networks. In BCI.
[23] S. Biswal et al. 2017. SLEEPNET: automated sleep staging system via deep learning. arXiv preprint arXiv:1707.08262 (2017).
[24] E. Carabez et al. 2017. Identifying Single Trial Event-Related Potentials in an Earphone-Based Auditory Brain-Computer Interface. Applied Sciences (2017).
[25] H. Cecotti et al. 2011. Convolutional neural networks for P300 detection with application to brain-computer interfaces. TPAMI (2011).
[26] H. Cecotti et al. 2014. Single-trial classification of event-related potentials in rapid serial visual presentation tasks using supervised spatial filtering. TNNLS (2014).


[27] H. Cecotti et al. 2017. Best practice for single-trial detection of event-related potentials: Application to brain-computer interfaces. International Journal of Psychophysiology (2017).
[28] H. Cecotti et al. 2017. Convolutional neural networks for event-related potential detection: impact of the architecture. In EMBC.
[29] R. Chai et al. 2017. Improving EEG-based driver fatigue classification using sparse-deep belief networks. Front Neurosci (2017).
[30] X. Chai et al. 2016. Unsupervised domain adaptation techniques based on auto-encoder for non-stationary EEG-based emotion recognition. Computers in Biology and Medicine (2016).
[31] S. Chambon et al. 2018. A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series. TNSRE (2018).
[32] A. Chiarelli et al. 2018. Deep learning for hybrid EEG-fNIRS brain-computer interface: application to motor imagery classification. J Neural Eng (2018).
[33] L. Chu et al. 2017. Individual recognition in schizophrenia using deep learning methods with random forest and voting classifiers: Insights from resting state EEG streams. arXiv preprint arXiv:1707.03467 (2017).
[34] R. Cichy et al. 2016. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence. Scientific Reports (2016).
[35] M. Dai et al. 2019. EEG Classification of Motor Imagery Using a Novel Deep Learning Framework. Sensors (2019).
[36] Li Deng. 2012. Three classes of deep learning architectures and their applications: a tutorial survey. APSIPA Transactions on Signal and Information Processing (2012).
[37] L. Deng et al. 2014. Deep learning: methods and applications. Foundations and Trends in Signal Processing (2014).
[38] L. Deng et al. 2014. A tutorial survey of architectures, algorithms, and applications for deep learning. APSIPA Transactions on Signal and Information Processing (2014).
[39] H. Dong et al. 2018. Mixed neural network approach for temporal sleep stage classification. TNSRE (2018).
[40] L. Du et al. 2017. Detecting driving fatigue with multimodal deep learning. In NER.
[41] L. Duan et al. 2016. Classification Based on Multilayer Extreme Learning Machine for Motor Imagery Task from EEG signals. Procedia Computer Science (2016).
[42] M. Fatourechi et al. 2007. EMG and EOG artifacts in brain computer interface systems: A survey. Clinical Neurophysiology (2007).
[43] Isaac Fernandez-Varela, Dimitrios Athanasakis, Samuel Parsons, Elena Hernandez-Pereira, and Vicente Moret-Bonillo. 2018. Sleep staging with deep learning: a convolutional model. In ESANN 2018.
[44] L. Fraiwan et al. 2017. Neonatal sleep state identification using deep learning autoencoders. In CSPA.
[45] A. Frydenlund et al. 2015. Emotional affect estimation using video and EEG data in deep neural networks. In Canadian Conference on Artificial Intelligence.
[46] W. Gao et al. 2015. Multi-ganglion ANN based feature learning with application to P300-BCI signal classification. Biomedical Signal Processing and Control (2015).
[47] Y. Gao et al. 2015. Deep learning of EEG signals for emotion recognition. In ICMEW.
[48] P. Garg et al. 2017. Automatic 1D convolutional neural network-based detection of artifacts in MEG acquired without electrooculography or electrocardiography. In PRNI.
[49] M. Golmohammadi et al. 2017. Deep Architectures for Automated Seizure Detection in Scalp EEGs. arXiv preprint arXiv:1712.09776 (2017).
[50] Y. Gordienko et al. 2017. Deep learning for fatigue estimation on the basis of multimodal human-machine interactions. arXiv preprint arXiv:1801.06048 (2017).
[51] S. Gordon et al. 2017. Real world BCI: cross-domain learning and practical applications. In Proceedings of the 2017 ACM Workshop on An Application-oriented Approach to BCI out of the laboratory.
[52] Q. Gui et al. 2019. A Survey on Brain Biometrics. Comput. Surveys (2019).
[53] A. Hachem et al. 2014. Effect of fatigue on SSVEP during virtual wheelchair navigation. JATIT (2014).
[54] A. Haider et al. 2017. Application of P300 Event-Related Potential in Brain-Computer Interface. In Event-Related Potentials and Evoked Potentials.
[55] M. Hajinoroozi et al. 2015. Feature extraction with deep belief networks for driver's cognitive states prediction from EEG data. In ChinaSIP.
[56] M. Hajinoroozi et al. 2015. Prediction of driver's drowsy and alert states from EEG signals with deep learning. In CAMSAP.
[57] M. Hajinoroozi et al. 2017. Deep Transfer Learning for Cross-subject and Cross-experiment Prediction of Image Rapid Serial Visual Presentation Events from EEG Data. In International Conference on Augmented Cognition.
[58] C. Han et al. 2018. GAN-based synthetic brain MR image generation. In ISBI 2018.
[59] K. Hartmann et al. 2018. Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decoding. In BCI.

[60] A. Hasasneh et al. 2018. Deep Learning Approach for Automatic Classification of Ocular and Cardiac Artifacts in MEG Data. Journal of Engineering (2018).
[61] M. Havaei et al. 2017. Brain tumor segmentation with deep neural networks. Medical Image Analysis (2017).
[62] J. Hennrich et al. 2015. Investigating deep learning for fNIRS based BCI. In EMBC.
[63] L. Hernandez et al. 2018. EEG-Based Detection of Braking Intention Under Different Car Driving Conditions. Frontiers in Neuroinformatics (2018).
[64] T. Hiroyasu et al. 2014. Gender classification of subjects from cerebral blood flow changes using Deep Learning. In CIDM.
[65] M. Hosseini et al. 2017. Cloud-based deep learning of big EEG data for epileptic seizure prediction. arXiv preprint arXiv:1702.05192 (2017).
[66] M. Hosseini et al. 2017. Deep learning with edge computing for localization of epileptogenicity using multimodal rs-fMRI and EEG big data. In ICAC.
[67] M. Hosseini et al. 2017. Optimized deep learning for EEG big data and seizure prediction BCI via internet of things. IEEE Transactions on Big Data (2017).
[68] C. Hu et al. 2016. Clinical decision support for Alzheimer’s disease based on deep learning and brain network. In ICC.
[69] D. Huang et al. 2012. Electroencephalography (EEG)-based brain–computer interface (BCI): A 2-D virtual wheelchair control based on event-related desynchronization/synchronization and state control. TNSRE (2012).
[70] Y. Hung et al. 2017. Brain dynamic states analysis based on 3D convolutional neural network. In SMC.
[71] G. Huve et al. 2017. Brain activity recognition with a wearable fNIRS using neural networks. In ICMA.
[72] G. Huve et al. 2018. Brain-computer interface using deep neural network and its application to mobile robot control. In AMC.
[73] X. Jia et al. 2014. A novel semi-supervised deep learning framework for affective state recognition on EEG signals. In BIBE.
[74] L. Jingwei et al. 2015. Deep learning EEG response representation for brain computer interface. In CCC.
[75] A. Johansen et al. 2016. Epileptiform spike detection via convolutional neural networks. In ICASSP.
[76] I. Kavasidis et al. 2017. Brain2image: Converting brain signals into images. In MM.
[77] K. Kawasaki et al. 2015. Visualizing extracted feature by deep learning in P300 discrimination task. In SoCPaR.
[78] P. Kawde et al. 2017. Deep belief network based affect recognition from physiological signals. In UPCON.
[79] P. Khurana et al. 2016. Class-wise deep dictionaries for EEG classification. In IJCNN.
[80] I. Kiral-Kornek et al. 2018. Epileptic seizure prediction using big data and deep learning: toward a mobile system. EBioMedicine (2018).
[81] T. Koike-Akino et al. 2016. High-accuracy user identification using EEG biometrics. In EMBC.
[82] S. Koyamada et al. 2015. Deep learning of fMRI big data: a novel approach to subject-transfer decoding. arXiv preprint arXiv:1502.00093 (2015).
[83] J. Kulasingham et al. 2016. Deep belief networks and stacked autoencoders for the P300 Guilty Knowledge Test. In IECBES.
[84] S. Kumar et al. 2016. A deep learning approach for motor imagery EEG signal classification. In APWC on CSE.
[85] N. Kwak et al. 2017. A convolutional neural network for steady state visual evoked potential classification under ambulatory environment. PLoS ONE (2017).
[86] V. Lawhern et al. 2018. EEGNet: a compact convolutional neural network for EEG-based brain–computer interfaces. J NEURAL ENG (2018).
[87] H. Lee et al. 2018. A convolution neural networks scheme for classification of motor imagery EEG based on wavelet time-frequency image. In ICOIN.
[88] E. Leuthardt et al. 2009. Evolution of brain-computer interfaces: going beyond classic motor physiology. Neurosurgical Focus (2009).
[89] J. Li et al. 2014. Deep learning of multifractal attributes from motor imagery induced EEG. In ICONIP.
[90] J. Li et al. 2015. Feature learning from incomplete EEG with denoising autoencoder. Neurocomputing (2015).
[91] J. Li et al. 2016. Implementation of EEG emotion recognition system based on hierarchical convolutional neural networks. In International Conference on Brain Inspired Cognitive Systems.
[92] J. Li et al. 2017. Hierarchical convolutional neural networks for EEG-based emotion recognition. Cognitive Computation (2017).
[93] K. Li et al. 2013. Affective state recognition from EEG with deep belief networks. In BIBM.
[94] P. Li et al. 2016. Single-channel EEG-based mental fatigue detection based on deep belief network. In EMBC.
[95] R. Li et al. 2014. Deep learning based imaging data completion for improved brain disease diagnosis. In MICCAI.
[96] X. Li et al. 2015. EEG based emotion identification using unsupervised deep feature learning.
[97] Q. Lin et al. 2016. Classification of epileptic EEG signals with stacked sparse autoencoder based on deep learning. In International Conference on Intelligent Computing.
[98] Z. Lin et al. 2017. Method for enhancing single-trial P300 detection by introducing the complexity degree of image information in rapid serial visual presentation tasks. PLoS ONE (2017).
[99] G. Litjens et al. 2017. A survey on deep learning in medical image analysis. Medical Image Analysis (2017).
[100] J. Liu et al. 2018. Applications of deep learning to MRI images: a survey. Big Data Mining and Analytics (2018).
[101] M. Liu et al. 2018. Deep learning based on Batch Normalization for P300 signal detection. Neurocomputing (2018).
[102] Q. Liu et al. 2017. Deep Belief Networks for EEG-Based Concealed Information Test. In ISNN.
[103] W. Liu et al. 2016. Emotion recognition using multimodal deep learning. In ICONIP.
[104] W. Liu et al. 2017. Analyze EEG Signals with Convolutional Neural Network Based on Power Spectrum Feature Selection. Proceedings of Science (2017).
[105] F. Lotte et al. 2007. A review of classification algorithms for EEG-based brain–computer interfaces. J NEURAL ENG (2007).
[106] F. Lotte et al. 2018. A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update. J NEURAL ENG (2018).
[107] N. Lu et al. 2017. A deep learning scheme for motor imagery classification based on restricted Boltzmann machines. TNSRE (2017).
[108] T. Ma et al. 2017. The extraction of motion-onset VEP BCI features based on deep learning and compressed sensing. Journal of Neuroscience Methods (2017).
[109] R. Maddula et al. 2017. Deep recurrent convolutional neural networks for classifying P300 BCI signals. In Proceedings of the 7th Graz Brain-Computer Interface Conference, Graz, Austria.
[110] M. Mahmud et al. 2018. Applications of deep learning and reinforcement learning to biological data. TNNLS (2018).
[111] R. Manor et al. 2015. Convolutional neural network for multi-category rapid serial visual presentation BCI. Frontiers in Computational Neuroscience (2015).
[112] R. Manor et al. 2016. Multimodal neural network for rapid serial visual presentation brain computer interface. Frontiers in Computational Neuroscience (2016).
[113] M. Manzano et al. 2017. Combination of EEG Data Time and Frequency Representations in Deep Networks for Sleep Stage Classification. In International Conference on Intelligent Computing.
[114] Z. Mao et al. 2014. Classification of non-time-locked rapid serial visual presentation events for brain-computer interaction using deep learning. In ChinaSIP.
[115] Z. Mao et al. 2016. Deep learning for rapid serial visual presentation event from electroencephalography signal. Ph.D. Dissertation. The University of Texas at San Antonio.
[116] Z. Mao et al. 2017. EEG-based biometric identification with deep learning. In NER.
[117] M. Moreno Lopez. 2017. Deep learning for brain tumor segmentation. Master’s dissertation. University of Colorado Colorado Springs (2017).
[118] S. Mason et al. 2007. A comprehensive survey of brain interface technology designs. ANN BIOMED ENG (2007).
[119] K. Mikkelsen et al. 2015. EEG recorded from the ear: Characterizing the ear-EEG method. Front Neurosci (2015).
[120] S. Min et al. 2017. Deep learning in bioinformatics. Briefings in Bioinformatics (2017).
[121] J.A. Miranda-Correa et al. 2018. A Multi-Task Cascaded Network for Prediction of Affect, Personality, Mood and Social Context Using EEG Signals. In FG 2018.
[122] F. Morabito et al. 2016. Deep convolutional neural networks for classification of mild cognitive impaired and Alzheimer’s disease patients from scalp EEG recordings. In RTSI.
[123] F. Morabito et al. 2017. Deep learning representation from electroencephalography of Early-Stage Creutzfeldt-Jakob disease and features for differentiation from rapidly progressive dementia. Int J Neural Syst. (2017).
[124] F. Movahedi et al. 2018. Deep belief networks for electroencephalography: A review of recent contributions and future outlooks. JBHI (2018).
[125] S. Narejo et al. 2016. EEG based eye state classification using deep belief network and stacked AutoEncoder. IJECE (2016).
[126] N. Naseer et al. 2015. fNIRS-based brain-computer interfaces: a review. Frontiers in Human Neuroscience (2015).
[127] N. Naseer et al. 2016. Analysis of different classification techniques for two-class functional near-infrared spectroscopy-based brain-computer interface. Computational Intelligence and Neuroscience (2016).
[128] E. Nurse et al. 2015. A generalizable brain-computer interface (BCI) using machine learning for feature discovery. PLoS ONE (2015).
[129] E. Nurse et al. 2016. Decoding EEG and LFP signals using deep learning: heading TrueNorth. In Computing Frontiers.
[130] A. Ortiz et al. 2016. Ensembles of deep learning architectures for the early diagnosis of the Alzheimer’s disease. Int J Neural Syst. (2016).
[131] M. Pacharra et al. 2017. Concealed around-the-ear EEG captures cognitive processing in a visual Simon task. Frontiers in Human Neuroscience (2017).
[132] A. Page et al. 2014. Comparing Raw Data and Feature Extraction for Seizure Detection with Deep Learning Methods. In FLAIRS Conference.
[133] S. Palazzo et al. 2017. Generative adversarial networks conditioned by brain signals. In ICCV.
[134] C. Pandarinath et al. 2017. High performance communication by people with paralysis using an intracortical brain-computer interface. eLife (2017).
[135] R. Parasuraman et al. 2012. Individual differences in cognition, affect, and performance: Behavioral, neuroimaging, and molecular genetic approaches. NeuroImage (2012).
[136] J.L. Perez-Benítez et al. 2018. Development of a brain computer interface using multi-frequency visual stimulation and deep neural networks. In CONIELECOMP.
[137] G. Pfurtscheller et al. 2001. Motor imagery and direct brain-computer communication. Proc. IEEE (2001).
[138] S. Plis et al. 2014. Deep learning for neuroimaging: a validation study. Front Neurosci (2014).
[139] M. Putten et al. 2018. Predicting sex from brain rhythms with deep learning. Scientific Reports (2018).
[140] A. Radford et al. 2016. Unsupervised representation learning with deep convolutional generative adversarial networks. ICLR (2016).
[141] T. Reddy et al. 2016. Online Eye state recognition from EEG data using Deep architectures. In SMC.
[142] S. Redkar et al. 2015. Using Deep Learning for Human Computer Interface via Electroencephalography. IAES International Journal of Robotics and Automation (2015).
[143] D. Regan et al. 1977. Steady-state evoked potentials. JOSA (1977).
[144] Y. Ren et al. 2014. Convolutional deep belief networks for feature extraction of EEG signal. In IJCNN.
[145] G. Ruffini et al. 2016. EEG-driven RNN classification for prognosis of neurodegeneration in at-risk patients. In International Conference on Artificial Neural Networks.
[146] S. Ruiz et al. 2014. Real-time fMRI brain computer interfaces: self-regulation of single brain regions to networks. Biological Psychology (2014).
[147] S. Sakhavi et al. 2015. Parallel convolutional-linear neural network for motor imagery classification. In EUSIPCO.
[148] W. Samek et al. 2012. Brain-computer interfacing in discriminative and stationary subspaces. In EMBC.
[149] P. San et al. 2016. EEG-based driver fatigue detection using hybrid deep generic model. In EMBC.
[150] S. Sarkar et al. 2016. Wearable EEG-based activity recognition in PHM-related service environment via deep learning. Int. J. Progn. Health Manag (2016).
[151] S. Sarraf et al. 2016. Deep learning-based pipeline to recognize Alzheimer’s disease using fMRI data. In FTC.
[152] R. Schirrmeister et al. 2017. Deep learning with convolutional neural networks for decoding and visualization of EEG pathology. In SPMB.
[153] J. Schmidhuber et al. 2015. Deep learning in neural networks: An overview. Neural Networks (2015).
[154] K. Seeliger et al. 2018. Generative adversarial networks for reconstructing natural images from brain activity. NeuroImage (2018).
[155] V. Shah et al. 2017. Optimizing channel selection for seizure detection. In SPMB.
[156] M. Shahin et al. 2017. Deep Learning and Insomnia: Assisting Clinicians With Their Diagnosis. JBHI (2017).
[157] J. Shamwell et al. 2016. Single-trial EEG RSVP classification using convolutional neural networks. In Micro- and Nanotechnology Sensors, Systems, and Applications VIII.
[158] A. Shanbhag et al. 2017. P300 analysis using deep neural network. In ICECDS.
[159] J. Shang et al. 2017. Cognitive load recognition using multi-channel complex network method. In ISNN.
[160] G. Shen et al. 2019. Deep image reconstruction from human brain activity. PLoS Computational Biology (2019).
[161] V. Shreyas et al. 2017. A deep learning architecture for brain tumor segmentation in MRI images. In MMSP.
[162] M. Shu et al. 2013. Sparse autoencoders for word decoding from magnetoencephalography. In MLINI.
[163] S.K. Singh et al. 2018. Study for Reduction of Pollution Level in Diesel Engines, Petrol Engines and Generator Sets by Bio signal Ring. International Journal of Advance Research and Innovation (2018).
[164] S. Soekadar et al. 2015. Brain–machine interfaces in neurorehabilitation of stroke. Neurobiology of Disease (2015).
[165] A. Solon et al. 2017. Deep Learning Approaches for P300 Classification in Image Triage: Applications to the NAILS Task. In NTCIR.
[166] A. Sors et al. 2018. A convolutional neural network for sleep stage scoring from raw single-channel EEG. Biomedical Signal Processing and Control (2018).
[167] C. Spampinato et al. 2017. Deep learning human mind for automated visual classification. In CVPR.
[168] A. Sternin et al. 2015. Tempo estimation from the EEG signal during perception and imagination of music. In Plymouth.
[169] S. Stober et al. 2014. Classifying EEG Recordings of Rhythm Perception. In ISMIR.
[170] S. Stober et al. 2014. Using Convolutional Neural Networks to Recognize Rhythm Stimuli from Electroencephalography Recordings. In Advances in Neural Information Processing Systems.
[171] S. Stober et al. 2015. Deep feature learning for EEG recordings. arXiv preprint arXiv:1511.04306 (2015).
[172] I. Sturm et al. 2016. Interpretable deep neural networks for single-trial EEG classification. Journal of Neuroscience Methods (2016).
[173] N. Suhaimi et al. 2015. Studies on classification of fMRI data using deep learning approach.
[174] H. Suk et al. 2015. Deep learning in diagnosis of brain disorders. In Recent Progress in Brain and Cognitive Engineering.
[175] H. Suk et al. 2016. State-space model with deep learning for functional dynamics estimation in resting-state fMRI. NeuroImage (2016).
[176] A. Supratak et al. 2017. DeepSleepNet: a model for automatic sleep stage scoring based on raw single-channel EEG. TNSRE (2017).
[177] Y. Tabar et al. 2016. A novel deep learning approach for classification of EEG motor imagery signals. J NEURAL ENG (2016).
[178] S. Talathi et al. 2017. Deep Recurrent Neural Networks for seizure detection and early seizure detection systems. arXiv preprint arXiv:1706.03283 (2017).
[179] C. Tan et al. 2017. Multimodal Classification with Deep Convolutional-Recurrent Neural Networks for Electroencephalography. In ICONIP.
[180] D. Tan et al. 2015. Sleep spindle detection using deep learning: a validation study based on crowdsourcing. In EMBC.
[181] Z. Tang et al. 2017. Single-trial EEG classification of motor imagery using deep convolutional neural networks. Optik - International Journal for Light and Electron Optics (2017).
[182] J. Teo et al. 2017. Deep learning for EEG-Based preference classification. In AIP Conference Proceedings.
[183] J. Thomas et al. 2017. Deep learning-based classification for brain-computer interfaces. In SMC.
[184] O. Tsinalis et al. 2016. Automatic sleep stage scoring with single-channel EEG using convolutional neural networks. arXiv preprint arXiv:1610.01683 (2016).
[185] K. Tsiouris et al. 2018. A Long Short-Term Memory deep learning network for the prediction of epileptic seizures using EEG signals. Computers in Biology and Medicine (2018).
[186] T. Tu et al. 2018. Relating deep neural network representations to EEG-fMRI spatiotemporal dynamics in a perceptual decision-making task. In CVPRW.
[187] J. Turner et al. 2014. Deep belief networks used on high resolution multichannel electroencephalography data for seizure detection. In 2014 AAAI Spring Symposium Series.
[188] T. Uktveris et al. 2017. Application of convolutional neural networks to four-class motor imagery classification problem. Information Technology and Control (2017).
[189] I. Ullah et al. 2018. An automated system for epilepsy detection using EEG brain signals based on deep learning approach. Expert Systems with Applications (2018).
[190] L. Vareka et al. 2017. Stacked Autoencoders for the P300 Component Detection. Front Neurosci (2017).
[191] S. Vieira et al. 2017. Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications. Neuroscience & Biobehavioral Reviews (2017).
[192] A. Vilamala et al. 2017. Neural Networks for Interpretable Analysis of EEG Sleep Stage Scoring. In MLSP.
[193] M. Volker et al. 2018. Deep transfer learning for error decoding from non-invasive EEG. In BCI.
[194] F. Wang et al. 2018. Data Augmentation for EEG-Based Emotion Recognition with Deep Convolutional Neural Networks. In International Conference on Multimedia Modeling.
[195] J. Wang et al. 2018. Deep learning for sensor-based activity recognition: A survey. Pattern Recognition Letters (2018).
[196] K. Wang et al. 2016. Research on healthy anomaly detection model based on deep learning from multiple time-series physiological signals. Scientific Programming (2016).
[197] Q. Wang et al. 2017. Multi-channel EEG Classification Based on Fast Convolutional Feature Extraction. In ISNN.
[198] X. Wang et al. 2016. A Survey of the BCI and its Application Prospect. In Theory, Methodology, Tools and Applications for Modeling and Simulation of Complex Systems.
[199] N. Waytowich et al. 2018. Compact Convolutional Neural Networks for Classification of Asynchronous Steady-state Visual Evoked Potentials. arXiv preprint arXiv:1803.04566 (2018).
[200] D. Wen et al. 2018. Deep Learning Methods to Process fMRI Data and Their Application in the Diagnosis of Cognitive Impairment: A Brief Overview and Our Opinion. Frontiers in Neuroinformatics (2018).
[201] T. Wen et al. 2018. Deep Convolution Neural Network and Autoencoders-Based Unsupervised Feature Learning of EEG Signals. IEEE Access (2018).
[202] B. Xia et al. 2015. Electrooculogram based sleep stage classification using deep belief network. In IJCNN.
[203] Z. Xie et al. 2018. Decoding of finger trajectory from ECoG using deep learning. J NEURAL ENG (2018).
[204] H. Xu et al. 2016. Affective states classification using EEG and semi-supervised deep learning approaches. In MMSP.
[205] H. Xu et al. 2016. EEG-based affect states classification using deep belief networks. In DMIAF.
[206] C. Yang et al. 1989. Early smooth horizontal eye movement: a favorable prognostic sign in patients with locked-in syndrome. Archives of Physical Medicine and Rehabilitation (1989).
[207] H. Yang et al. 2015. On the use of convolutional neural networks and augmented CSP features for multi-class motor imagery of EEG signals classification. In EMBC.
[208] Y. Roy et al. 2019. Deep learning-based electroencephalography analysis: a systematic review. arXiv preprint arXiv:1901.05498 (2019).
[209] A. Yepes et al. 2017. Improving classification accuracy of feedforward neural networks for spiking neuromorphic chips. arXiv preprint arXiv:1705.07755 (2017).
[210] E. Yin et al. 2015. A hybrid brain–computer interface based on the fusion of P300 and SSVEP scores. TNSRE (2015).
[211] Z. Yin et al. 2017. Cross-session classification of mental workload levels using EEG and an adaptive deep learning model. Biomedical Signal Processing and Control (2017).
[212] Z. Yin et al. 2017. Recognition of emotions using multimodal physiological signals and an ensemble deep learning model. Computer Methods and Programs in Biomedicine (2017).
[213] J. Yoon et al. 2018. Spatial and Time Domain Feature of ERP Speller System Extracted via Convolutional Neural Network. Computational Intelligence and Neuroscience (2018).
[214] Y. Yuan et al. 2017. A novel wavelet-based model for EEG epileptic seizure detection using multi-context learning. In BIBM.
[215] Y. Yuan et al. 2018. A novel channel-aware attention framework for multi-channel EEG seizure detection via multi-view deep learning. In BHI.
[216] J. Zhang et al. 2016. Automatic sleep stage classification based on sparse deep belief net and combination of multiple classifiers. Transactions of the Institute of Measurement and Control (2016).
[217] P. Zhang et al. 2018. Multi-channel generative adversarial network for parallel magnetic resonance image reconstruction in k-space. In MICCAI.
[218] T. Zhang et al. 2018. Spatial-temporal recurrent neural network for emotion recognition. IEEE Transactions on Cybernetics (2018).
[219] X. Zhang et al. 2017. DeepKey: An EEG and Gait Based Dual-Authentication System. arXiv preprint arXiv:1706.01606 (2017).
[220] X. Zhang et al. 2017. Intent recognition in smart living through deep recurrent neural networks. In ICONIP.
[221] X. Zhang et al. 2017. Multi-person brain activity recognition via comprehensive EEG signal analysis. In Mobiquitous.
[222] X. Zhang et al. 2018. Brain2Object: Printing Your Mind from Brain Signals with Spatial Correlation Embedding. arXiv preprint arXiv:1810.02223 (2018).
[223] X. Zhang et al. 2018. Converting your thoughts to texts: Enabling brain typing via deep feature learning of EEG signals. In PerCom.
[224] X. Zhang et al. 2018. Internet of Things Meets Brain-Computer Interface: A Unified Deep Learning Framework for Enabling Human-Thing Cognitive Interactivity. IEEE Internet of Things Journal (2018).
[225] X. Zhang et al. 2018. MindID: Person Identification from Brain Waves through Attention-based Recurrent Neural Network. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (2018).
[226] X. Zhang et al. 2018. Multi-modality sensor data classification with selective attention. IJCAI (2018).
[227] X. Zhang et al. 2019. Adversarial Variational Embedding for Robust Semi-supervised Learning. In SIGKDD 2019.
[228] Y. Zhao et al. 2014. Deep learning in the EEG diagnosis of Alzheimer’s disease. In ACCV.
[229] W. Zheng et al. 2014. EEG-based emotion classification using deep belief networks. In ICME.
[230] W. Zheng et al. 2015. Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Transactions on Autonomous Mental Development (2015).
[231] W. Zheng et al. 2015. Revealing critical channels and frequency bands for emotion recognition from EEG with deep belief network. In NER.
[232] W. Zheng et al. 2016. Personalizing EEG-based affective models with transfer learning. In IJCAI.
