
 Available online at www.sciencedirect.com

Medical Engineering & Physics 30 (2008) 248–257

Assessment and comparison of different methods for heartbeat classification

I. Jekova a,∗, G. Bortolan b, I. Christov a

a Centre of Biomedical Engineering, Bulgarian Academy of Sciences, Sofia, Bulgaria
b Institute of Biomedical Engineering ISIB, CNR, Padova, Italy

Received 5 May 2006; received in revised form 5 February 2007; accepted 7 February 2007

Abstract

The most common way to diagnose cardiac dysfunctions is ECG signal analysis, usually starting with the assessment of the QRS complex as the most significant wave in the electrocardiogram. Many methods for automatic heartbeat classification have been applied and reported in the literature, but the use of different ECG features and the training and testing on different datasets make their direct comparison questionable. This paper presents a comparative study of the learning capacity and the classification abilities of four classification methods: Kth nearest neighbour rule, neural networks, discriminant analysis and fuzzy logic. They were applied to 26 morphological parameters, which include information on amplitude, area, interval durations and the QRS vector in a VCG plane, and were tested on five types of ventricular complexes: normal heartbeats, premature ventricular contractions, left and right bundle branch blocks, and paced beats.

One global, one basic and two local learning sets were used. A small-sized learning set, containing the five types of QRS complexes collected from all patients in the MIT-BIH database, was used either with or without applying the leave-one-out rule, thus representing the global and the basic learning set, respectively. The local learning sets consisted of heartbeats only from the tested patient, which were taken either consecutively or randomly.

Using the local learning sets, the assessed methods achieved high accuracies, while the small size of the basic learning set came at the cost of reduced classification ability. As expected, the worst results were obtained with the global learning set.

© 2007 IPEM. Published by Elsevier Ltd. All rights reserved.

Keywords: Automatic heartbeat classification; Kth nearest neighbour rule; Neural networks; Discriminant analysis; Fuzzy logic

1. Introduction

The most common way to study and diagnose cardiac dysfunctions is ECG signal analysis, usually starting with the assessment of the QRS complex as the most significant wave in the electrocardiogram. The normal ventricular complexes (N) are provoked by the sinus node and are related to a regular conduction path through the ventricles, which assures their normal narrow waveform. The existence of ectopic centers, as well as some blocked regions in the ventricles, changes the path of propagation of the activation front and leads to the generation of QRS complexes with wide and bizarre waveforms related to premature ventricular contractions (PVC) and left and right bundle branch blocks (LBBB, RBBB). Another type of ventricular complex with a particular QRS behavior is the heartbeat provoked by a pacemaker, the so-called paced beat (PB). The automatic detection and classification of ventricular beats considerably facilitates the analysis of long-term ECG Holter recordings. Therefore, the accuracy of automatic heartbeat classification software is of great importance for the precise diagnosis of cardiac dysfunctions. It depends on three basic factors: the heartbeat feature set used, the classification method applied and the organization of the training strategy.

∗ Corresponding author at: Acad. G. Bonchev Str. bl. 105, 1113 Sofia, Bulgaria. Tel.: +359 2 979 3631; fax: +359 2 723787.
E-mail address: [email protected] (I. Jekova).

Recently, a study compared the adequacy of morphologi-

cal and time-frequency ECG descriptors for classification of 

the QRS complexes in five heartbeat classes [1]. An attempt

to estimate the influence of the applied classifier and the used

learning set on the classification accuracy was performed in

1350-4533/$ – see front matter © 2007 IPEM. Published by Elsevier Ltd. All rights reserved.

doi:10.1016/j.medengphy.2007.02.003


[2], but it is limited to N and PVC heartbeats. Moreover, one of the learning strategies in the latter work is applicable only in the context of discrimination between N and PVC complexes, since it relies on the generation of artificial PVCs, and cannot be applied to LBBB, RBBB and PB.

The literature contains information about various classification rules, including linear discriminators [3,4], neural networks [5–8], fuzzy adaptive resonance theory mapping [9], self-organized maps [10], operation on vectors in the multidimensional space [11,12], etc. The selection of an appropriate classification rule depends greatly on the specific application, in terms of the particularities of the estimated ECG descriptors, the available computational resources and possible real-time operation.

Taking into account the wide variations in the heartbeat waveforms among individual patients, some authors [1–3,12,13] pay attention to the organization of the classifiers' learning dataset, considering two main training strategies: a global learning set (built from a large ECG database) and a local learning set (customized to the tested patient). The global learning set is preferable in many commercial applications, since it avoids the need for time-consuming human expert annotations associated with manual editing of patient-specific ECG records. However, the cited authors reported significant improvement with a patient-adaptable local learning set. In this sense, specific strategies are adopted for local learning in some arrhythmia monitors, for example in intensive care rooms, where only the normal or predominant beats are annotated/identified by the cardiologist, and any irregular heartbeats, like premature ventricular contractions, are classified automatically on the basis of pattern differences with the annotated beats. It is obvious that the size of the local learning set must be restricted for convenience, but there is no agreement in the literature about its optimal size.

Many methods for automatic heartbeat classification have been applied and reported in the literature. However, the use of different ECG feature sets, as well as the training and testing on different datasets, makes their direct comparison questionable. In this paper, we present a comparative study of the classification abilities of four classification methods: the Kth nearest neighbour classification rule (Knn), neural networks (NN), discriminant analysis (DA) and fuzzy logic (FL). They were applied to the large collection of morphological QRS descriptors used by Christov and Bortolan [5]. We tested the performance of these four classification methods for five heartbeat classes with particular QRS behavior (N, PVC, LBBB, RBBB and PB), as well as its dependence on the content and the size of the learning set.

2. Materials and methods

2.1. ECG signals

The study involved all 48 ECG recordings from the MIT-BIH arrhythmia database. Each recording has a duration of 30 min and includes two leads: the modified limb lead II and one of the modified leads V1, V2, V4 or V5 [14]. The sampling frequency is 360 Hz and the resolution is 200 samples per mV. The heartbeats were recognized by the fiducial points in the database and the original database annotations were accepted. The study was focused on the classification of the five largest heartbeat classes in the MIT-BIH arrhythmia database: (i) normal beats (N), about 74820 cases; (ii) premature ventricular contractions (PVC), about 6970; (iii) left bundle branch blocks (LBBB), about 8050; (iv) right bundle branch blocks (RBBB), about 7220; (v) paced beats (PB), about 7000. Table 1 shows the distribution of these heartbeat types among the ECG recordings. In our study, we analyzed the N, PVC, LBBB, RBBB and PB beats in the full-length MIT-BIH files, without selection based on the quality of the signal. Thus the analysis was applied even in the presence of artifacts or noise in the ECG signal, which could reduce the accuracy.

2.2. Preprocessing

The preprocessing filtration was taken from Christov and Bortolan [5] and consists of the following procedures:

• a notch filter for elimination of the power-line interference, implemented by moving averaging of the samples in one period of the interference;

• a low-pass filter for suppression of the tremor noise, realized by moving averaging of the samples in a 30 ms time-interval, thus having a first zero at about 35 Hz;

• a high-pass recursive filter for drift suppression [15] with a cut-off frequency of 2.2 Hz.
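The three filtering steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the 60 Hz mains frequency is assumed (the text does not state it), the function names are hypothetical, and the first-order recursive high-pass is a generic stand-in, not the filter of reference [15].

```python
import numpy as np

FS = 360.0  # MIT-BIH sampling frequency in Hz (from the text)


def moving_average(x, n):
    """Average over n consecutive samples (FIR filter with its first zero at FS / n)."""
    return np.convolve(x, np.ones(n) / n, mode="same")


def highpass_first_order(x, fc, fs=FS):
    """Generic first-order recursive high-pass; a stand-in, NOT the filter of [15]."""
    rc = 1.0 / (2.0 * np.pi * fc)
    alpha = rc / (rc + 1.0 / fs)
    y = np.zeros(len(x))
    for k in range(1, len(x)):
        y[k] = alpha * (y[k - 1] + x[k] - x[k - 1])
    return y


def preprocess(ecg, mains_hz=60.0):
    """Notch (one mains period), low-pass (30 ms window), then drift suppression."""
    notch_len = int(round(FS / mains_hz))   # assumed 60 Hz mains: 6 samples
    lowpass_len = int(round(0.030 * FS))    # 30 ms window: 11 samples
    y = moving_average(ecg, notch_len)
    y = moving_average(y, lowpass_len)
    return highpass_first_order(y, fc=2.2)
```

With an 11-sample window the first zero of the low-pass filter falls at 360/11, roughly 33 Hz, which is in line with the "about 35 Hz" quoted in the text.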

Table 1

Distribution of the N, PVC, LBBB, RBBB and PB heartbeats among the ECG recordings

Heartbeat type   ECG recordings containing the respective heartbeat types

N                100–106, 108, 112–117, 119, 121–123, 200–203, 205, 208–210, 212, 213, 215, 217, 219–223, 228, 230, 231, 233, 234

PVC              100, 102, 104–109, 111, 114, 116, 118, 119, 121, 123, 124, 200–203, 205, 207–210, 213–215, 217, 219, 221, 223, 228, 230, 231, 233, 234

LBBB             109, 111, 207, 214

RBBB             118, 124, 207, 212, 231, 232

PB               102, 104, 107, 217


Fig. 1. (a) Reference waveforms for the five heartbeat types under classification (N, PVC, LBBB, RBBB, PB); (b and c) graphical illustration of the morphological descriptors extracted from ECG leads and VCG plane, respectively.

2.3. Morphological descriptors

The method for calculation of the large collection of morphological descriptors was applied to all QRS complexes annotated as N, PVC, LBBB, RBBB or PB in the MIT-BIH arrhythmia database (see the reference waveforms in Fig. 1a). An isoelectric baseline segment is searched for, starting from the QRS fiducial point and moving back on the time axis by up to 120 ms. The segment is found if eight successive differences between adjacent samples are less than a preset value [16] and the difference between the end samples of the segment is lower than the same value. The earliest sample of this segment is defined as the isoelectric point. The QRS pattern recognition technique [16] identified the onset and the offset of the QRS complex by simultaneous analysis of the two ECG leads. Then 26 morphological descriptors, representing information on the amplitude, area, specific interval durations and measurements of the QRS vector in the vectorcardiographic (VCG) plane, were calculated as follows:

• Eleven descriptors were extracted by individual assessment of each lead (22 in total for the two leads):

  (1) Pp: maximal amplitude of the positive peak.
  (2) Pn: maximal amplitude of the negative peak.
  (3) ArP: area of the positive samples in the identified pattern.
  (4) ArN: area of the negative samples in the identified pattern.
  (5) Ar: area of the QRS complex, i.e. the sum of the absolute values of the ECG samples in the identified pattern (Ar = ArP + ArN).
  (6) Av: sum of the absolute values of the velocities in the pattern interval.
  (7) No: number of samples crossing a threshold of 70% of the highest peak amplitude.
  (8) Ima: time-interval from the QRS complex onset to the maximal positive peak.
  (9) Imi: time-interval from the QRS complex onset to the maximal negative peak.
  (10) S1: QRS slope velocity calculated for the time-interval between the QRS complex onset and the first peak.
  (11) S2: QRS slope velocity calculated for the time-interval between the first peak and the second peak.

• One descriptor, Width, representing the time-interval between the onset and the offset of the ventricular contraction, was derived by the simultaneous analysis of the two leads.

• Three descriptors were calculated from the single-plane VCG formed by the two leads:

  • VCGamp: maximal amplitude of the VCG vector.
  • VCGsin: sine component of the angle of the maximal amplitude vector.
  • VCGcos: cosine component of the angle of the maximal amplitude vector.

All morphological parameters are illustrated in Fig. 1b and c.
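The isoelectric search and several of the descriptors defined above could be computed along the following lines. This is a rough sketch, not the authors' implementation: the function names are hypothetical, `threshold` stands in for the preset value of reference [16] (not given in the text), the input `qrs` is assumed to hold the baseline-corrected samples between the detected onset and offset, "No" is interpreted here as the count of samples whose absolute value exceeds 70% of the largest peak, and areas are plain sample sums without unit scaling. S1, S2 and Width are omitted because they depend on the peak and onset/offset detection of [16].

```python
import numpy as np

FS = 360.0  # sampling frequency in Hz (from the text)


def find_isoelectric_point(lead, fiducial, threshold, fs=FS):
    """Search back from the fiducial point (up to 120 ms) for a flat 9-sample segment.

    Returns the index of the earliest sample of the segment, or None if none is found.
    """
    earliest = max(0, fiducial - int(round(0.120 * fs)))
    for end in range(fiducial, earliest + 7, -1):
        seg = lead[end - 8:end + 1]  # 9 samples give 8 successive differences
        flat = np.all(np.abs(np.diff(seg)) < threshold)
        if flat and abs(seg[-1] - seg[0]) < threshold:
            return end - 8
    return None


def lead_descriptors(qrs, fs=FS):
    """Single-lead descriptors from baseline-corrected samples between onset and offset."""
    ar_p = np.clip(qrs, 0.0, None).sum()    # area of the positive samples
    ar_n = -np.clip(qrs, None, 0.0).sum()   # area of the negative samples (as a magnitude)
    peak = np.max(np.abs(qrs))
    return {
        "Pp": qrs.max(),
        "Pn": qrs.min(),
        "ArP": ar_p,
        "ArN": ar_n,
        "Ar": ar_p + ar_n,
        "Av": np.abs(np.diff(qrs)).sum(),
        "No": int(np.sum(np.abs(qrs) > 0.7 * peak)),
        "Ima": np.argmax(qrs) / fs,          # seconds from onset to the positive peak
        "Imi": np.argmin(qrs) / fs,          # seconds from onset to the negative peak
    }


def vcg_descriptors(qrs1, qrs2):
    """VCG descriptors from the two leads; assumes the loop is not identically zero."""
    amp = np.hypot(qrs1, qrs2)
    k = int(np.argmax(amp))
    return {"VCGamp": amp[k], "VCGsin": qrs2[k] / amp[k], "VCGcos": qrs1[k] / amp[k]}
```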

2.4. Classification methods

2.4.1. Kth nearest neighbour rule

This classification rule operates on vectors in a multidimensional space [17,18]. In our case the heartbeats were represented by the 26-dimensional vector x = {Width, Pp1, Pn1, Ar1, ArP1, ArN1, Av1, No1, Ima1, Imi1, S11, S21, Pp2, Pn2, Ar2, ArP2, ArN2, Av2, No2, Ima2, Imi2, S12, S22, VCGamp, VCGsin, VCGcos}, where the trailing index 1 or 2 denotes the ECG lead.

Initially, the reference set of the classification rule (called

learning set in our study) was formed to contain vectors of all


analyzed heartbeat classes – N, PVC, LBBB, RBBB, and PB.

Each vector was previously labelled as belonging to one of 

the five classes by the accompanying database annotations.

During the first iteration, clusters were formed by grouping together those beats that belong to the same class and lie at a Euclidean distance from each other smaller than 10% of the range of each data dimension. Thus, a number of separate clusters for each class were formed in the multidimensional space. During the next iterations, every remaining vector, independently of its class, was assigned to the cluster whose centroid was at the smallest Euclidean distance. The class label of each cluster was defined by the class of the constituent vectors. If a cluster contained vectors belonging to different classes, it was called a mixed cluster. A mixed cluster accepted the class label of the dominant vectors if there was a dominance of at least three elements; otherwise the cluster was not considered. The distances between the centroids of all possible pairs of clusters were checked. When two clusters labelled with the same class were closer than 30% of the range of each data dimension, they were merged. If the adjacent pair were of different classes, the smaller cluster was removed from the cluster list and its vectors became 'free' (unclustered). During this process, the free vectors were assigned to the existing clusters with the procedures described above. These iterations continued until the mean cluster vectors remained unchanged or the number of iterations exceeded a predefined value (200 iterations in our case).

According to the K-nearest neighbours rule, a new vector x (belonging to an unknown class) was classified on the basis of the nearest mean vector. The distance between vector x and the centroid of the jth cluster $z_j$ was computed as the Euclidean distance:

$$d_j = \sqrt{\sum_{i=1}^{n} \left(x_{i\text{-norm}} - z_{ji}\right)^2},$$

where j is the cluster index, i the parameter index, and n is the number of parameters used. Vector x was assigned to the class of the cluster $z_j$ for which $d_j$ had a minimum.

All parameters used in this equation must first be normalized with respect to the standard deviation, because their different magnitudes or variances might cause unequal weighting in the calculation of $d_j$, with inappropriate effects. The normalization was defined by

$$x_{i\text{-norm}} = \frac{x_i}{\sqrt{\dfrac{1}{N_r}\sum_{j=1}^{N_r}\left(x_{ji} - \bar{x}_i\right)^2}},$$

where $N_r$ is the number of all reference vectors and $\bar{x}_i$ is the mean value of the ith parameter.
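The normalization and the nearest-centroid decision can be illustrated with a short sketch. It covers only the final classification step, not the iterative cluster formation; the function names are hypothetical, and the cluster centroids are assumed to be already expressed in the normalized parameter space.

```python
import numpy as np


def fit_normalizer(reference):
    """Per-parameter standard deviation over the N_r reference vectors (1/N_r form)."""
    return reference.std(axis=0, ddof=0)


def classify_nearest_centroid(x, centroids, labels, sigma):
    """Normalize x by sigma and return the label of the closest centroid plus all d_j."""
    x_norm = x / sigma
    d = np.sqrt(((x_norm - centroids) ** 2).sum(axis=1))
    return labels[int(np.argmin(d))], d


# Toy example with two parameters of very different magnitude
reference = np.array([[0.0, 0.0], [2.0, 200.0]])
sigma = fit_normalizer(reference)
centroids = np.array([[0.0, 0.0], [2.0, 2.0]])   # assumed to be in normalized units
label, distances = classify_nearest_centroid(
    np.array([1.8, 150.0]), centroids, ["N", "PVC"], sigma
)
```

Without the division by `sigma`, the second parameter would dominate the distance simply because of its larger numerical range, which is the unequal weighting the text warns about.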

2.4.2. Fuzzy logic

The method applied for the classification of QRS complexes used a partition of the feature space into five classes by fuzzy clustering, followed by fuzzy classification [19–22]. In this way a Sugeno-type fuzzy inference system was developed, whose membership function parameters were tuned using fuzzy clustering with a least-squares type method. In particular, a dynamic algorithm performed the partition using a decreasing rate of the influence of the cluster center (from 50% of the width of the input data space to 10%). Bell-shaped (Gaussian) membership functions were used. Every QRS complex had a membership value in each of the five classes: N, PVC, LBBB, RBBB or PB. The tested QRS complex was assigned to the class with the maximal membership value.

2.4.3. Linear discriminant analysis

Linear discriminant analysis (DA) is a classical statistical approach for classifying samples of unknown classes, based on training samples with known classes [3]. It is assumed that the features have normal distributions, although violations of the normality assumption are not fatal as long as non-normality is caused by skewness and not by outliers [23].

For each of the learning sets described below, the 26 morphological descriptors were subjected to standard linear discriminant analysis. Equal a priori classification probabilities were used for all heartbeat classes. All cases that produced outliers in the parameter distributions were excluded from the analysis. Outliers were defined as data points outside the range of the mean value ± 3 × standard error. Thus, five linear discriminant functions of the n-dimensional vector x were generated by the software package Statistica and were used to differentiate between N, PVC, LBBB, RBBB and PB beats:

$$F_j(x) = \sum_{i=1}^{n} w_{ji}\, x_i + a_j$$

Here j runs from 1 to 5, i runs from 1 to the number of parameters n (26 in our case), $x_i$ is the ith parameter, and $w_{ji}$ and $a_j$ are the discriminant coefficients and constants, which are calculated to maximize the distance between the means of the different classification groups. In the testing phase, the five discriminant functions (j = 1, ..., 5) were calculated for the assessed heartbeat, which was labelled as belonging to the class with the maximal value of $F_j$.
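As an illustration of how discriminant functions of this form are evaluated, the sketch below derives the coefficients with the textbook equal-prior formulation (pooled within-class covariance). It is not a reproduction of the Statistica computation used in the study, the function names are hypothetical, and the outlier exclusion step is left out.

```python
import numpy as np


def fit_linear_discriminants(X, y):
    """Return classes, W (row j holds w_j) and a (constants) for F_j(x) = w_j . x + a_j."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Pooled within-class covariance matrix
    centered = X - means[np.searchsorted(classes, y)]
    pooled = centered.T @ centered / (len(X) - len(classes))
    W = np.linalg.solve(pooled, means.T).T
    # Equal a priori probabilities: the log-prior term is the same for all classes
    a = -0.5 * np.einsum("ji,ji->j", W, means)
    return classes, W, a


def classify(x, classes, W, a):
    """Evaluate all discriminant functions and pick the class with the maximal F_j."""
    F = W @ x + a
    return classes[int(np.argmax(F))], F


# Toy data: two well-separated classes in two dimensions
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.0, 0.0], 1.0, (50, 2)), rng.normal([6.0, 6.0], 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
classes, W, a = fit_linear_discriminants(X, y)
predicted, scores = classify(np.array([5.5, 6.2]), classes, W, a)
```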

2.4.4. Neural networks

From the 26 morphological descriptors, an appropriate neural network architecture was tested for the classification of the five classes of QRS complexes [5–8,24–26]. Multiple feed-forward neural networks were used, and a back-propagation learning phase with an approximation of a second-order algorithm, based on the Levenberg–Marquardt method, was applied to improve the speed of convergence. One input layer (X), one hidden layer with five nodes and one output layer with five output nodes (corresponding to N, PVC, LBBB, RBBB or PB) were used. The output vector ($Y_k$) can be expressed at iteration k as

$$Y_k = f(X, W_k)$$


and the network weights are updated as follows:

$$W_{k+1} = W_k - \left[J^{T} J + \mu I\right]^{-1} J^{T} e$$

where J is the Jacobian matrix of first derivatives of the network errors (e) with respect to the weights and μ is a dynamic parameter. A standard normalization procedure was performed on the input data and the mean square error was used as the error function. The NN training was terminated after a fixed number of iterations (100).

The assessed heartbeat is labelled as belonging to the class with the highest output component.
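The weight update rule can be made concrete with a small numerical sketch. To keep it self-contained it is applied to a toy linear least-squares problem rather than to a neural network, and μ is held fixed, whereas the text describes it as a dynamic parameter; `lm_step` is a hypothetical helper name.

```python
import numpy as np


def lm_step(w, jacobian, errors, mu):
    """One Levenberg-Marquardt update: w - (J^T J + mu I)^-1 J^T e."""
    jtj = jacobian.T @ jacobian
    step = np.linalg.solve(jtj + mu * np.eye(len(w)), jacobian.T @ errors)
    return w - step


# Toy problem: fit target = w0 + w1 * t, with errors e = model output - target
t = np.array([0.0, 1.0, 2.0, 3.0])
target = 1.0 + 2.0 * t
J = np.column_stack([np.ones_like(t), t])  # derivatives of e with respect to (w0, w1)
w = np.zeros(2)
for _ in range(100):                       # fixed iteration count, as in the text
    e = J @ w - target
    w = lm_step(w, J, e, mu=0.01)
```

For small μ the step approaches the Gauss-Newton step, while for large μ it approaches a short gradient-descent step, which is why μ is usually adapted during training.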

Fig. 2. Statistical assessment (mean value (·), ±standard error () and ±standard deviation (I)) of the morphological descriptors in groups defined by the

different heartbeat classes.


Table 2

Sensitivity (Se), specificity (Sp), positive predictive value (PPV) and negative predictive value (NPV) obtained with the different classification methods (Knn, FL, DA, NN). Learning [...] each patient record; L2 local, 33.3(3)% randomly selected QRS complexes of each patient record

Only three of the four column groups survive in this copy; by comparison with the values quoted in the text and in Table 3 they correspond to Knn, FL and DA, and the NN columns are missing.

        Knn                                 | FL                                  | DA
Class   Se (%)  Sp (%)  PPV (%)  NPV (%)    | Se (%)  Sp (%)  PPV (%)  NPV (%)    | Se (%)  Sp (%)  PPV (%)  NPV (%)

Learning set G
N       87.19   81.52   92.31    71.43      | 87.33   80.61   91.98    71.43      | 83.97   90.79   95.87    68.99
PVC     68.83   83.64   23.55    97.34      | 78.79   84.99   27.76    98.21      | 57.56   88.99   27.67    96.63
LBBB    18.84   95.31   25.15    93.35      | 24.76   94.03   25.77    93.72      | 27.42   89.33   17.70    93.63
PB       2.14   99.32   18.54    93.37      | 18.47   99.52   73.32    94.42      | 48.24   99.63   90.30    96.39
RBBB    43.23   97.11   52.70    95.82      | 27.87   98.47   57.54    94.82      | 31.54   94.43   29.69    94.87

Learning set B
N       94.80   95.31   98.09    87.83      | 94.33   85.56   94.33    85.56      | 88.84   94.93   97.81    76.97
PVC     88.80   94.63   54.79    99.14      | 85.30   95.62   58.78    98.89      | 70.96   97.63   68.65    97.87
LBBB    58.10   98.32   74.36    96.56      | 64.26   96.97   63.93    97.01      | 85.56   91.16   44.76    98.69
PB      74.50   99.23   78.49    98.18      | 65.16   99.65   93.10    97.54      | 80.52   99.81   96.78    98.61
RBBB    88.50   98.23   78.86    99.13      | 75.93   99.39   90.24    98.23      | 81.38   98.19   76.28    98.66

Learning set L1
N       98.40   98.00   99.19    96.05      | 97.01   98.36   99.33    92.95      | 99.67   98.92   99.57    99.17
PVC     90.73   98.77   84.83    99.29      | 91.96   97.70   75.20    99.38      | 94.81   99.76   96.80    99.60
LBBB    99.98   100     99.95    100        | 100     99.91   98.95    100        | 100     100     99.97    100
PB      100     99.93   99.02    100        | 99.82   99.92   98.96    99.99      | 99.98   99.93   99.11    100
RBBB    100     99.98   99.70    100        | 99.12   99.99   99.84    99.94      | 99.92   99.98   99.74    100

Learning set L2
N       99.91   99.54   99.82    99.76      | 99.85   99.37   99.75    99.62      | 96.64   98.46   99.37    92.01
PVC     97.76   99.92   98.96    99.84      | 97.40   99.89   98.47    99.81      | 96.94   97.27   72.29    99.77
LBBB    99.98   99.98   99.81    100        | 99.91   99.97   99.69    99.99      | 99.98   99.98   99.81    100
PB      99.98   100     99.96    100        | 99.96   99.99   99.91    100        | 95.92   99.99   99.91    99.71
RBBB    99.96   99.99   99.90    100        | 99.65   99.99   99.88    99.97      | 98.45   100     99.94    99.88


2.5. Learning sets

The wide variations in the heartbeat waveforms among individual patients make the selection of the learning set a critical choice. For this reason, two main training strategies are considered: a global learning set, independent of the tested patient and built from a large ECG database, and a local learning set, customized to the tested patient. Consequently, in order to study the influence of the content and the size of the learning set on the classification accuracy, we formed one global learning set and three local learning sets, defined as follows:

Fig. 3. Statistical indices (Se, Sp) estimated with the Knn, FL, DA and NN classification methods for the defined five heartbeat classes and four different

learning sets.


(1) Global learning set (G): it contains five types of QRS complexes collected from all patients in the MIT-BIH database. From each record we selected one QRS complex of each type available in the record. A total of 91 QRS complexes were collected as a basic learning dataset. In the training process, the leave-one-out rule (LOO) was applied in order to make the learning independent of the tested patient (global learning). This means that in the classification of one ECG recording, only the heartbeats in G from the remaining 47 ECG recordings were used.

(2) Basic learning set (B): it contains five types of QRS complexes collected from all patients in the MIT-BIH database. From each record we selected one QRS complex of each type available in the record. Thus, 91 QRS complexes were used as a basic learning dataset (it is similar to G, but without the LOO rule).

(3) Local learning set (L1): it consists of the first 30% of the selected record.

(4) Local learning set (L2): it consists of 33.3(3)% randomly selected QRS complexes from the record. Note that the heartbeats were randomly selected once and then used for training all four methods.

G may be considered in the strategy of global learning

sets, while the remaining sets B, L1 and L2 are local learning

sets.
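The construction of the learning sets can be sketched in a few lines. The helper names and the dictionary layout of a beat (with a "record" key) are hypothetical, and the sketch assumes that "the first 30% of the record" is counted in beats rather than in time. The fixed seed mirrors the statement that the random selection was made once and reused for all four methods.

```python
import numpy as np


def split_l1(n_beats, fraction=0.30):
    """L1: the first 30% of the beats of a record form the learning set."""
    n_learn = int(round(n_beats * fraction))
    idx = np.arange(n_beats)
    return idx[:n_learn], idx[n_learn:]


def split_l2(n_beats, fraction=1.0 / 3.0, seed=0):
    """L2: one third of the beats, drawn at random once (fixed seed) for all methods."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_beats)
    n_learn = int(round(n_beats * fraction))
    return np.sort(perm[:n_learn]), np.sort(perm[n_learn:])


def global_set_loo(basic_set, tested_record):
    """G: the basic set B with every beat of the tested record left out."""
    return [beat for beat in basic_set if beat["record"] != tested_record]


# Usage on a hypothetical record with 90 beats
learn_l1, test_l1 = split_l1(90)
learn_l2, test_l2 = split_l2(90)
basic = [
    {"record": 100, "cls": "N"},
    {"record": 100, "cls": "PVC"},
    {"record": 101, "cls": "N"},
]
g_for_100 = global_set_loo(basic, tested_record=100)
```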

3. Results

The statistical distributions of the morphological descriptors for the different heartbeat classes are presented in Fig. 2. To save space, the parameters measured for the two ECG leads are shown only for ECG lead 1.

The classification abilities of the four classification meth-

ods, applied on the morphological ECG descriptors are

estimated for the four learning sets described above by pro-

cessing all heartbeats from MIT-BIH arrhythmia database,

which were not used for learning and belong to the selected

heartbeat classes. Four statistical indices were calculated for

each heartbeat class i (i = N, PVC, LBBB, PB, RBBB): sensi-

tivity (Sei), specificity (Spi), positive predictive value (PPVi)

and negative predictive value (NPVi). They are computed

according to the following relations [27]:

$$Sp_i = \frac{TN_i}{TN_i + FP_i}, \qquad PPV_i = \frac{TP_i}{TP_i + FP_i},$$

$$NPV_i = \frac{TN_i}{TN_i + FN_i}, \qquad Se_i = \frac{TP_i}{TP_i + FN_i}$$

TPi (true positives) is the number of heartbeats of the ith

class, which are correctly classified (e.g. N classified as N);

TNi (true negatives) is the number of heartbeats not belong-

ing to the ith class and not classified in the ith class (e.g.

PVC, LBBB, RBBB and PB not classified as N); FPi (false

positives) is the number of heartbeats classified erroneously

in the ith class (e.g. PVC, LBBB, RBBB and PB classified as

N); FNi (false negatives) is the number of heartbeats of class

i, classified in a different class (e.g. N not classified as N).
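A minimal sketch of how these four indices can be computed in a one-versus-rest fashion for each class is given below. The function name and the toy label lists are illustrative only; a ratio whose denominator is zero is returned as NaN, which is a choice made here and not something stated in the text.

```python
import math

import numpy as np

CLASSES = ["N", "PVC", "LBBB", "PB", "RBBB"]


def _ratio(num, den):
    """Safe division: NaN when the denominator is zero (e.g. class absent)."""
    return num / den if den > 0 else math.nan


def per_class_indices(true_labels, predicted_labels, classes=CLASSES):
    """Compute Se, Sp, PPV and NPV for each class from two label sequences."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    results = {}
    for c in classes:
        tp = int(np.sum((true_labels == c) & (predicted_labels == c)))
        tn = int(np.sum((true_labels != c) & (predicted_labels != c)))
        fp = int(np.sum((true_labels != c) & (predicted_labels == c)))
        fn = int(np.sum((true_labels == c) & (predicted_labels != c)))
        results[c] = {
            "Se": _ratio(tp, tp + fn),
            "Sp": _ratio(tn, tn + fp),
            "PPV": _ratio(tp, tp + fp),
            "NPV": _ratio(tn, tn + fn),
        }
    return results


# Toy example with six beats
truth = ["N", "N", "N", "PVC", "PVC", "LBBB"]
predicted = ["N", "N", "PVC", "PVC", "N", "LBBB"]
indices = per_class_indices(truth, predicted)
```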

The results are listed in Table 2 and their graphical repre-

sentation is illustrated in Fig. 3.

4. Discussion and conclusion

The performance of Knn, FL, DA and NN, applied to heartbeat
classification into five classes, depends on the learning set used.
Fig. 3 shows the sensitivities and specificities for all learning
sets. The results obtained by training all methods with the global
learning set (G) were unreliable, especially for PB processed with
Knn (Se = 2.14%) and for LBBB and RBBB processed with NN
(Se = 5.13% and Se = 8.38%, respectively). However, the low
accuracies for these three heartbeat types were to be expected, since
the MIT-BIH arrhythmia database contains only four patients with PB,
four patients with LBBB and five patients with RBBB, which is
insufficient for collecting a reliable global learning set. All
methods showed limited results with B, due to its small size, and
significantly better accuracies with L1 and L2. The disadvantage of
L1 and L2 is that they do not contain all analyzed heartbeat types
but only those that are typical for the ECG recording of the
respective patient. Thus, the classification methods cannot correctly
recognize heartbeat types that appear only occasionally, and such
ventricular complexes are bound to be misclassified when using L1 or
L2. The Knn, NN and FL methods perform best with L2. In general, the
better learning capacity of L2 is due to the fact that it comprises
the variations in QRS waveform morphology that appear in the ECG
recording of one patient. Learning sets of this kind may be obtained
by random manual editing of the ECG recording. Since the DA method
achieved its best accuracy with the L1 learning set, it can be
speculated that the variations in the morphology parameters lead to
unstable discriminant functions and lower classification accuracies.
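The limitation of the local learning sets can be made concrete with a small sketch. It assumes, as the discussion suggests, that L1 is the consecutively selected set and L2 the randomly selected one, and it replaces the Kth nearest neighbour rule by a plain 1-nearest-neighbour rule for brevity. The helper names, the two features and the toy recording are all invented for the illustration.

```python
import math
import random


def consecutive_learning_set(beats, size):
    """Consecutive selection: the first `size` beats of the recording."""
    return list(beats[:size])


def random_learning_set(beats, size, seed=0):
    """Random selection of `size` beats from the whole recording."""
    return random.Random(seed).sample(list(beats), size)


def nearest_neighbour_label(learning_set, features):
    """Label of the learning-set beat closest to `features` (Euclidean distance)."""
    closest = min(learning_set, key=lambda beat: math.dist(beat[0], features))
    return closest[1]


# Toy recording: 50 normal beats followed by one isolated PVC.
# The features (amplitude in mV, QRS width in ms) are made-up values.
recording = [((1.0 + 0.01 * i, 80.0), "N") for i in range(50)]
recording.append(((2.5, 140.0), "PVC"))

l1_like = consecutive_learning_set(recording, 10)
pvc_features = recording[-1][0]
predicted = nearest_neighbour_label(l1_like, pvc_features)
# `predicted` is "N": the learning set holds no PVC, so PVC can never be output.
```

A classifier of this kind can only return labels that occur in its learning set, so a beat type missing from the set is misclassified regardless of how distinctive its features are. A random selection has a chance of including rare beats, but offers no guarantee.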

Although many studies dealing with heartbeat classification have
been published, a strict comparison with the results of the present
work is difficult, since different heartbeat categories and different
ECG datasets were used. Several studies concern the discrimination
between premature ventricular contractions (PVC) and normal beats
(N). For example, Ham and Han [9] obtained a sensitivity and
specificity of 97% and 99% for PVC classification on 6 MIT-BIH
records, whereas Moraes et al. [11] obtained 90.7% and 96.6% with 44
MIT-BIH records. However, aiming to set a benchmark for the five
heartbeat types considered in this study, we compared our results
with the clustering ability reported by Lagerholm et al. [10], who
processed all 48 MIT-BIH recordings

for clustering ECG complexes using Hermite functions and
self-organizing maps.

Table 3
Sensitivity (Se) and specificity (Sp) obtained by the four analysed classification methods (Knn, FL and NN with the local learning set L2 and DA with the local learning set L1) compared to the statistical indices achieved by Lagerholm et al. [10]

Beat type   Lagerholm et al. [10]   Knn with L2             FL with L2              DA with L1              NN with L2
            Se (%)      Sp (%)      Se (%)      Sp (%)      Se (%)      Sp (%)      Se (%)      Sp (%)      Se (%)      Sp (%)
N           99.79       99          99.91       99.54       99.85       99.37       99.67       98.92       99.88       99.57
PVC         95.79       97.86       97.76       99.92       97.40       99.89       94.81       99.76       97.99       99.90
LBBB        99.83       98.39       99.98       99.98       99.91       99.97       100         100         99.93       99.98
PB          98.97       97.69       99.98       100         99.96       99.99       99.98       99.93       99.96       100
RBBB        98.01       98.66       99.96       99.99       99.65       99.99       99.92       99.98       99.92       100

Some differences are present in the two
studies:

• In [10] a proper QRS detector was used, excluding from
analysis about 0.3% of the N, RBBB and PVC beats and
0.2% of the LBBB and PB beats, while in the present study
the annotation of the MIT-BIH database was used, and all
heartbeats were analyzed.
• In [10] all the ECG recordings were clustered in 15 classes
based on the dominant class, while in the present study a
learning set and a testing set were used for the classification
of 5 classes.

Although clustering is a different problem from classification, we
computed the corresponding sensitivity and specificity indices for
the five considered classes from the clustering table in [10]. The
results are reported in Table 3, in comparison to the best results
obtained by the Knn, FL, DA and NN methods with the different
learning sets.
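The text does not detail how the indices were derived from the clustering table. One plausible reading, sketched below purely as an assumption, assigns every cluster to its dominant beat class, treats that assignment as the classification of all beats in the cluster, and then counts TP, TN, FP and FN per class. The function name and the toy counts are invented and are not taken from [10].

```python
def se_sp_from_clustering_table(table, classes):
    """table: one dict per cluster mapping beat class -> number of beats.

    Assumption: each cluster is labelled with its dominant class
    (ties resolved by the order of `classes`).
    """
    # confusion[true_class][assigned_class] = number of beats
    confusion = {t: {a: 0 for a in classes} for t in classes}
    for cluster in table:
        dominant = max(classes, key=lambda c: cluster.get(c, 0))
        for t in classes:
            confusion[t][dominant] += cluster.get(t, 0)

    total = sum(sum(row.values()) for row in confusion.values())
    indices = {}
    for c in classes:
        tp = confusion[c][c]
        fn = sum(confusion[c].values()) - tp
        fp = sum(confusion[t][c] for t in classes) - tp
        tn = total - tp - fn - fp
        se = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
        sp = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")
        indices[c] = (se, sp)
    return indices


# Invented toy table with three clusters and two beat classes.
toy_table = [
    {"N": 90, "PVC": 2},   # dominant class: N
    {"N": 5, "PVC": 40},   # dominant class: PVC
    {"N": 5, "PVC": 8},    # dominant class: PVC
]
toy_indices = se_sp_from_clustering_table(toy_table, ["N", "PVC"])
# N: TP = 90, FN = 10, FP = 2, TN = 48, so Se = 90.0 and Sp = 96.0.
```

Under this reading, beats that fall into a cluster dominated by another class count as false negatives for their own class and as false positives for the dominant one, which makes the clustering result directly comparable with a classifier's Se and Sp.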

The performances of Knn, NN and FL trained with L2 for PVC, LBBB,
RBBB and PB are better than the results in [10], and the accuracies
for N beats achieved in both studies are comparable. The Se value for
PVC beats achieved with DA (trained with L1) is lower than that
reported in [10], but this is balanced by a higher Sp value. This may
be due to the large variety of bizarre PVC waveforms, which results
in relatively wide and skewed distributions of some of the
morphological parameters (see Fig. 2). For all other heartbeat types
DA gives reliable results, comparable to or higher than those
reported in [10].

The results of this study could be used both for selecting the most
suitable classification method and for determining the best strategy
for learning set collection, taking into account the predefined
requirements for heartbeat classification.

Acknowledgments

This study was supported by a joint project between

the Italian National Research Council (ISIB-CNR) and the

Bulgarian Academy of Sciences (Center of Biomedical

Engineering-BAS).

References

[1] Christov I, Gomez-Herrero G, Krasteva V, Jekova I, Gotchev

A, Egiazarian K. Comparative study of morphological and time-

frequency ECG descriptors for heartbeat classification. Med Eng Phys

2006;28:876–87.

[2] Bortolan G, Jekova I, Christov I. Comparison of four methods for pre-

mature ventricular contractions and normal beats clustering. Comput

Cardiol 2005;32:921–4.

[3] Chazal P, O’Dwyer M, Reilly RB. Automatic classification of heart-

beats using ECG morphology and heartbeat interval features. IEEE

Trans Biomed Eng 2004;51:1196–206.

[4] Jekova I, Bortolan G, Christov I. Pattern recognition and optimal

parameter selection in premature ventricular contraction classification.

Comput Cardiol 2004;31:357–60.

[5] Christov I, Bortolan G. Ranking of pattern recognition parameters for

premature ventricular contraction classification by neural networks.

Physiol Meas 2004;25:1281–90.

[6] Gomez-Herrero G, Gotchev A, Christov I, Egiazarian K. Heartbeat
classification using independent component analysis and matching pursuits.
In: IEEE Int. Conf. Acoustics, Speech and Signal Processing, vol. 4.
Philadelphia, USA: ICASSP; 2005. p. 725–8.

[7] Al-Nashash H. Cardiac arrhythmia classification using neural networks.
Technol Health Care 2000;8:363–72.

[8] Maglaveras N, Stamkopoulos T, Diamantaras K, Pappas C, Strintzis M.
ECG pattern recognition and classification using non-linear transformations
and neural networks: a review. Int J Med Inform 1998;52:191–208.

[9] Ham FM, Han S. Classification of cardiac arrhythmias using fuzzy

ARTMAP. IEEE Trans Biomed Eng 1996;43(4):425–30.

[10] Lagerholm M, Peterson G, Braccini G, Edenbrandt L, Sornmo L.
Clustering ECG complex using Hermite functions and self-organizing maps.
IEEE Trans Biomed Eng 2000;47(7):838–48.

[11] Moraes JCTB, Seixas MO, Vilani FN, Costa EV. A real time QRS

complex classification method using Mahalanobis distance. Comput

Cardiol 2002;29:201–4.

[12] Christov I, Jekova I, Bortolan G. Premature ventricular contraction
classification by the Kth nearest neighbours rule. Physiol Meas
2005;26:123–30.

[13] Hu YH, Palreddy S, Tompkins WJ. A patient-adaptable ECG beat
classifier using a mixture of experts approach. IEEE Trans Biomed Eng
1997;44(9):891–900.

[14] Mark R, Moody G. MIT-BIH Arrhythmia database directory. Cam-

bridge: Massachusetts Institute of Technology; 1988.

[15] Daskalov IK, Dotsinsky IA, Christov I. Developments in ECG acquisi-

tion, preprocessing, parameter measurement and recording. IEEE Eng

Med Biol 1998;17:50–8.

[16] Daskalov IK, Christov I. Electrocardiogram signal preprocessing
for automatic detection of QRS boundaries. Med Eng Phys
1999;21(1):37–44.

[17] Fukunaga K. Introduction to statistical pattern recognition. New York:

Academic Press; 1972.

[18] Kissiov VT, Hadjitodorov ST. A fuzzy version of the K-NN method.

Fuzzy Sets Syst 1992;49:323–9.


[19] Pedrycz W, Bortolan G, Degani R. Classification of electrocardio-

graphic signals: a fuzzy pattern matching approach. Artif Intell Med

1991;3:31–46.

[20] Jang JSR. ANFIS: adaptive-network-based fuzzy inference systems.

IEEE Trans Syst Man Cybern 1993;23(3):665–85.

[21] Chiu S. Fuzzy model identification based on cluster estimation. J Intell

Fuzzy Syst 1994;2(3):267–8.

[22] Sugeno M. Industrial applications of fuzzy control. North Holland:

Elsevier Science Publishers; 1985.

[23] Tabachnick BG, Fidell LS. Using multivariate statistics. New York:

Harper Collins College Publishers; 1996.

[24] Xue Q, Hu YH, Tompkins WJ. Neural-network-based adaptive matched
filtering for QRS detection. IEEE Trans Biomed Eng 1992;39:317–29.

[25] Yang TF, Devine B, Macfarlane PW. Artificial neural networks for

the diagnosis of atrial fibrillation. Med Biol Eng Comput 1994;32:

615–9.

[26] Silipo R, Bortolan G, Marchesi C. Design of hybrid architectures based
on neural classifier and RBF pre-processing for ECG analysis. Int J
Approx Reason 1999;21:177–96.

[27] Michaelis J, Wellek S, Willems JL. Reference standards for software

evaluation. Methods Inform Med 1990;29:289–97.