8/6/2019 Archit Seminar Report
CHAPTER-1
1 Introduction
The progress in desktop and portable computing in the past decade has provided the means, with the PC or a customized microcomputer-based instrument, to develop solutions to biomedical problems that could not be approached before. One of our personal interests has been the design of portable instruments that are light, compact, and battery powered. A typical instrument of this type is truly personal, since it is programmed to monitor signals from transducers or electrodes mounted on the person who is carrying it around.
1.1 Portable Microcomputer-Based Instruments
One example of a portable device is the portable arrhythmia monitor, which monitors a patient's electrocardiogram from chest electrodes and analyzes it in real time to determine if there are any heart rhythm abnormalities. We designed a prototype of such a device more than a decade ago. Because of the technology available at the time, this device was primitive compared with modern commercial portable arrhythmia monitors. The evolution of the technology also permits us to think of even more extensions that we can make. Instead of just assigning a heart monitoring device to follow a patient after discharge from the hospital, we can now think of designing a device that would help diagnose the heart abnormality when the patient arrives in the emergency room. With a careful design, the same device might go with the patient to monitor the cardiac problem during surgery in the operating room, continuously learning the unique characteristics of the patient's heart rhythms. The device could follow the patient throughout the hospital stay, alerting the hospital staff to possible problems in the intensive care unit, in the regular hospital room, and even in the hallways as the patient walks to the cafeteria. The device could then accompany the patient home, providing continuous monitoring that is not now practical, during the critical times following open heart surgery.
There are many other examples of portable biomedical instruments in the marketplace and in the research lab. One other microcomputer-based device that we contributed to developing is a calculator-size product called the CALTRAC, which uses a miniature accelerometer to monitor the motion of the body. It then converts this activity measurement to the equivalent number of calories and displays the cumulative result on an LCD display.
1.2 PC-Based Medical Instruments
The economy of mass production has led to the use of the desktop PC as the central computer for many types of biomedical applications. Many companies use PCs for such applications as sampling and analyzing physiological signals, maintaining equipment databases in the clinical engineering departments of hospitals, and simulation and modeling of physiological systems.
CHAPTER-2
2 Basic Electrocardiography
The electrocardiogram (ECG) is a graphical representation of the electrical activity of the heart and is obtained by connecting specially designed electrodes to the surface of the body [1]. It has been in use as a diagnostic tool for over a century and is extremely helpful in identifying cardiac disorders non-invasively. The detection of cardiac diseases using ECG has benefited from the advent of the computer and algorithms for machine identification of these cardiac disorders.
Fig. 2.1 The placement of the bipolar leads
Einthoven added a new dimension by introducing the concept of vectors to represent the ECG voltages. He was also the first individual to standardize the electrode locations for collecting ECG signals as right arm (RA), left arm (LA), and left leg (LL), and these locations are known after him as the standard leads of Einthoven, or limb leads, as shown in Figure 2.1. In addition to the limb leads, an electrocardiogram uses six unipolar chest leads, from lead V1 through V6.
Fig. 2.2 The placement of the exploratory electrode for the unipolar chest leads
Most cardiac disease classification algorithms begin with the separation, or delineation, of the individual ECG signal components. The ECG signal comprises the QRS complex and the P and T waves, as shown in Figure 3. Occasionally a U wave may also be present.
of electronics support and processing unit within the mobile phone, the overall performance is hardly operated in an ideal condition. The display screen of a mobile phone is smaller than.
An ECG signal, according to the American Heart Association, must consist of 3 individual leads, each recorded at 10 bits per sample and 500 samples per second. Some ECG signals may require 12 leads, 11 bits per sample, 1000 samples per second, and last 24 hours. When converted to a digital format, this single ECG record requires a total of 1.36 gigabytes of computer storage! Considering the 10 million ECGs annually recorded for the purposes of comparison and analysis in the United States alone, the necessity for effective ECG data compression techniques is becoming increasingly important. Furthermore, the growing need for transmission of ECGs for remote analysis is hindered by the capacity of the average analog telephone line and mobile radio.
2.2 Percent Root-Mean-Square Difference (PRD)
Data compression techniques are categorized as those in which the compressed data is reconstructed to form the original signal exactly and techniques in which higher compression ratios can be achieved by introducing some error into the reconstructed signal. The effectiveness of an ECG compression technique is described in terms of the compression ratio (CR), the ratio of the size of the original data to that of the compressed data; the execution time, the computer processing time required for compression and reconstruction of the ECG data; and a measure of error loss, often measured as the percent root-mean-square difference (PRD). The PRD is calculated as follows:

PRD = 100 × sqrt( Σ(ORG_i − REC_i)² / Σ ORG_i² )

where ORG is the original signal and REC is the reconstructed signal. The lower the PRD, the closer the reconstructed signal is to the original ECG data.
There are several exact-reconstruction techniques, including null suppression, run-length encoding, diatomic encoding, pattern substitution, differencing, facsimile techniques, and statistical encoding. Null suppression is a data-compression technique that searches for strings of empty, or null, data points and replaces them with an indicator and a number representing the length of the null sequence. Run-length encoding is a modification of null suppression, where the process used to compress null data points is applied to any repeating data sequence. For instance, a character sequence:

WGAQQQQQQRBCCCCCHZY

may be compressed as:

WGAQ*6RBC*5HZY
average beat subtraction techniques are commonly applied to ECG records. More complex techniques such as the use of wavelet packets, neural networks, and adaptive Fourier coefficients are currently being explored with the expectation that they will result in higher compression ratios but longer processing times for compression and reconstruction.
Figure 2.2 ZOP floating aperture.
Polynomial predictors are a data compression technique in which a polynomial of order k is fit through previously known data samples and then extrapolated to predict future ones. The polynomial predictor has the form:

y'_n = y_{n-1} + Δy_{n-1} + Δ²y_{n-1} + ... + Δᵏy_{n-1}

where

y'_n = predicted sample point at time t_n
y_{n-1} = sample value at one sample period prior to t_n
Δy_{n-1} = y_{n-1} − y_{n-2}
Δᵏy_{n-1} = Δᵏ⁻¹y_{n-1} − Δᵏ⁻¹y_{n-2}
The polynomial predictor with k = 0 is known as the zero-order predictor (ZOP). With the ZOP, each data point is predicted to be the same as the previous one, resulting in the equation:

y'_n = y_{n-1}
Most applications using the ZOP use a form known as a floating aperture, or step method. This algorithm records a single data sample, deletes each successive sample that lies within a tolerance band ε whose center lies on the saved sample, and replaces them with a horizontal line. When a sample falls outside the tolerance band, the new sample is saved and the process continues.
8/6/2019 Archit Seminar Report
7/32
Figure 2.3 The FOP.

The first-order predictor (FOP) is a polynomial predictor with k = 1. It requires knowledge of the two previous points, yielding the equation:

y'_n = 2y_{n-1} − y_{n-2}

It is similar to the ZOP; the difference lies in the way the predicting line is drawn. Rather than a horizontal line, the two previous points are used to formulate a starting point and a slope for the prediction. A tolerance band is still applied, and when a data sample lies outside the tolerance band, a new FOP is formulated.
Polynomial interpolators differ from polynomial predictors in that previous and future data samples are used to create the predicting polynomial. The zero-order interpolator (ZOI) modifies the ZOP by allowing the horizontal line to have a height corresponding to the average of a set of data rather than simply that of the first point. Although both the ZOI and the ZOP ensure that all data samples lie within a tolerance band around the reconstructed data, the ZOI results in higher compression since the saved data points are better chosen.

Figure 2.4 The ZOI.
The first-order interpolator (FOI) is similar to the FOP in that two points are used to formulate a line and a slope, but it uses the idea of the ZOI in choosing the two data samples that will optimize the compression ratio of the data. The FOI with two degrees of freedom (FOI-2DF), also known as the two-point projection method, is often found to be the most successful of the FOIs. It works by recording two successive data samples and creating a line connecting the points. If altering the line to pass through the third data point rather than the second still retains the second data point within a tolerance bound around the line, a new line is drawn from the first point to the fourth rather than to the third. This continues to
the Kth sample, when data points between the first and the Kth lie outside the tolerance bound around the line. When this occurs, the line connecting the first and (K−1)th samples is assumed to be the best approximation and is saved. However, in practice, the information that is actually saved is simply a height and a distance to the next saved sample. When decompressed, the line is drawn through the saved height and the next saved height, using this new height as the starting point of the next line.
The AZTEC, TP, and CORTES ECG compression techniques are becoming popular schemes in ECG data compression. The amplitude zone time epoch coding (AZTEC) algorithm converts the original ECG data into horizontal lines (plateaus) and slopes. Plateaus use a ZOI algorithm to compress the data, where an amplitude and a length are saved. Slopes are formed when the length of a plateau is less than three. The information saved for a slope is the length of the slope and its final amplitude. Although the AZTEC algorithm achieves a compression ratio of around 10:1, the step-like reconstruction of the ECG signal is unacceptable for accurate analysis by the cardiologist, especially in the P and T portions of the ECG cycle.
Figure 2.5 The FOI-2DF.
The turning point technique (TP) always produces a 2:1 compression ratio and retains the
important features of the ECG signal. It accomplishes this by replacing every three data
points with the two that best represent the slope of the original three points. The second of the
two saved points is used in the calculation of the next two points. The coordinate reduction
time encoding system (CORTES) combines the high compression ratios of the AZTEC
system and the high accuracy of the TP algorithm. The slopes of the AZTEC technique,
which occur mostly in the QRS complex, are replaced by TP compression, and long plateaus
that occur in an ECG signal are compressed using AZTEC compression. The result is
compression ratios only slightly lower than AZTEC, but much less error in the reconstructed
signal.
Finally, average beat subtraction has been applied with success to ECG data. In this
algorithm, it is assumed that the locations of the heart beats are known and the original ECG
signal has been segmented into vectors, each corresponding to a beat. Since beats vary in
length, the longest beat will be used to determine the length of each vector. The additional
vector elements in shorter beats are then filled by the use of various methods. Average beat
8/6/2019 Archit Seminar Report
9/32
subtraction then uses the assumption that each beat will have a similar pattern and averages
the vectors to create a new beat that should be very similar to all the original vectors. Once
this vector is saved, it is subtracted from each of the other vectors. The remaining data,
known as the residuals, is then compressed using one of the previously mentioned techniques.
Thus, the purpose of this investigation is twofold. First, several existing and originally designed ECG data compression techniques will be compared for compression ratios, execution times, and data loss. Second, a computer program will be designed that incorporates these data compression techniques in user-friendly software that enables the operator to easily compress and reconstruct ECG data through a simple graphic interface.
2.3 Experimental Design
Several ECG compression techniques will be written in the PASCAL computer language and tested on single-lead ECG records stored in the MIT-BIH Arrhythmia Database. The database is a CD-ROM collection, created at the Massachusetts Institute of Technology, of ninety two-hour ECG recordings with beat, rhythm, and signal quality annotations. The source of the ECGs is a set of over 4000 long-term Holter recordings made at the Beth Israel Hospital Laboratory over a period from 1975 through 1979. The subjects who produced the ECG signals are from a group of 25 men aged 32 to 89 years and 22 women aged 23 to 89 years. The original analog recordings of the ECG signals were made using nine Del Mar Avionics model 445 two-channel recorders.
The digitization is done while playing back the analog data on a Del Mar Avionics model 660 unit. The digitization creates filtered signals at 360 Hz with 11-bit resolution over a 5 mV range. Of importance to data compression is that this method of digitization results in 360 samples per second, with each sample corresponding to an integer ranging from 0 to 2047 inclusive. Besides the normal noise attributed to ECG recordings, such as muscle contraction and electrical interference, the process of recording and digitizing creates additional noise at frequencies of 0.042 Hz, 0.083 Hz, 0.090 Hz, and 0.167 Hz. Furthermore, the take-up and supply reels of the analog recordings produce noise at 0.18 Hz and 0.10 Hz, and at 0.20 Hz and 0.36 Hz, respectively.
Following collection of the raw ECG data, the computer program that will be written will utilize existing compression techniques to filter noise and compress the data. Original compression techniques will modify and combine existing techniques in the hope that higher compression ratios and less error will be achieved. In addition, a Macintosh computer interface will be created so that a cardiologist will be able to graphically compare the original ECG signal to a sample result of each of several compression techniques. This will allow the operator to choose which algorithm best obtains the highest compression ratio while retaining the needed ECG information. The interface will allow the user to complete this task without detailed knowledge of the mathematics behind the compression techniques.
Expected Results
Figure 2.6 Sample image from the ECGCompression program illustrating the original ECG signal (top) and the reconstructed signal (bottom), compressed using the FOI-2DF algorithm.

Preliminary results include a development version of the ECGCompression program and tests of the FOI-2DF, average beat subtraction, and first-order differencing. Use of these techniques has resulted in compression ratios of 10:1 and excellent reconstruction in the best cases. This is achieved by using a combination of the three algorithms. Use of the AZTEC, TP, CORTES, and Fourier coefficient algorithms will hopefully increase the compression ratios. A variation of one or more of these techniques will likely produce an optimal algorithm, and the created program will be used to verify the results.
2.4 Relevance
Compression techniques have been around for many years. However, there is still a continual need for the advancement of algorithms adapted for ECG data compression. The necessity of better ECG data compression methods is even greater today than just a few years ago, for several reasons. The quantity of ECG records is increasing by the millions each year, and previous records cannot be deleted, since one of the most important uses of ECG data is in the comparison of records obtained over a long period of time. ECG data compression techniques are limited by the amount of time required for compression and reconstruction, the noise embedded in the raw ECG signal, and the need for accurate reconstruction of the P, Q, R, S, and T waves. The results of this research will likely provide an improvement on existing compression techniques, and the original computer program will provide a simple interface so that the cardiologist can use ECG data compression techniques without knowledge of the specific details and mathematics behind the algorithms.
CHAPTER-3
3.1 Data Reduction Techniques
A typical computerized medical signal processing system acquires a large amount of data that is difficult to store and transmit. We need a way to reduce the data storage space while preserving the significant clinical content for signal reconstruction. In some applications, the process of reduction and reconstruction requires real-time performance.
A data reduction algorithm seeks to minimize the number of code bits stored by reducing the redundancy present in the original signal. We obtain the reduction ratio by dividing the number of bits of the original signal by the number saved in the compressed signal. We generally desire a high reduction ratio but caution against using this parameter as the sole basis of comparison among data reduction algorithms. Factors such as bandwidth, sampling frequency, and precision of the original data generally have considerable effect on the reduction ratio.

A data reduction algorithm must also represent the data with acceptable fidelity. In biomedical data reduction, we usually determine the clinical acceptability of the reconstructed signal through visual inspection. We may also measure the residual, that is, the difference between the reconstructed signal and the original signal. Such a numerical measure is the percent root-mean-square difference, PRD, given by

PRD = 100 × sqrt( Σᵢ (x_org(i) − x_rec(i))² / Σᵢ x_org(i)² )

where n is the number of samples and x_org and x_rec are samples of the original and reconstructed data sequences. A lossless data reduction algorithm produces zero residual, and the reconstructed signal exactly replicates the original signal. However, clinically acceptable quality is neither guaranteed by a low nonzero residual nor ruled out by a high numerical residual. For example, a data reduction algorithm for an ECG recording may eliminate small-amplitude baseline drift. In this case, the residual contains negligible clinical information. The reconstructed ECG signal can thus be quite clinically acceptable despite a high residual.

In this chapter we discuss two classes of data reduction techniques for the ECG. The first class, significant-point extraction, includes the turning point (TP) algorithm, AZTEC (Amplitude Zone Time Epoch Coding), and the Fan algorithm. These techniques generally retain samples that contain important information about the signal and discard the rest. Since they produce nonzero residuals, they are lossy algorithms. The second class of techniques is based on Huffman coding, in which variable-length code words are assigned to a given quantized data sequence according to frequency of occurrence. A predictive algorithm is normally used together with Huffman coding to further reduce data redundancy by examining a successive
number of neighboring samples.
3.2 Turning Point Algorithm
The original motivation for the turning point (TP) algorithm was to reduce the sampling frequency of an ECG signal from 200 to 100 samples/s. The algorithm developed from the observation that, except for QRS complexes with large amplitudes and slopes, a sampling rate of 100 samples/s is adequate. TP is based on the concept that ECG signals are normally oversampled at four or five times faster than the highest frequency present. For example, an ECG used in monitoring may have a bandwidth of 50 Hz and be sampled at 200 sps in order to easily visualize the higher-frequency attributes of the QRS complex. Sampling theory tells us that we can sample such a signal at 100 sps. TP provides a way to reduce the effective sampling rate by half to 100 sps by selectively saving important signal points (i.e., the peaks and valleys, or turning points).

The algorithm processes three data points at a time. It stores the first sample point and assigns it as the reference point X0. The next two consecutive points become X1 and X2. The algorithm retains either X1 or X2, depending on which point preserves the turning point (i.e., slope change) of the original signal.

Figure 10.1(a) shows all the possible configurations of three consecutive sample points. In each frame, the solid point preserves the slope of the original three points. The algorithm saves this point and makes it the reference point X0 for the next iteration. It then samples the next two points, assigns them to X1 and X2, and repeats the process. We use a simple mathematical criterion to determine the saved point. First consider a sign(x) operation
We then obtain s1 = sign(X1 − X0) and s2 = sign(X2 − X1), where (X1 − X0) and (X2 − X1) are the slopes of the two pairs of consecutive points. If a slope is zero, this operator produces a zero result. For positive or negative slopes, it yields +1 or −1, respectively. A turning point occurs only when a slope changes from positive to negative or vice versa.

We use the logical Boolean operators NOT and OR, as implemented in the C language, to make the final judgment of when a turning point occurs. In the C language, NOT(c) = 1 if c = 0; otherwise NOT(c) = 0. Also, logical OR means that (a OR b) = 0 only if a and b are both 0. Thus, we retain X1 only if {NOT(s1) OR (s1 + s2)} is zero, and save X2 otherwise. In this expression, (s1 + s2) is the arithmetic sum of the signs produced by the sign function. The final effect of this processing is a Boolean decision whether to save X1 or X2. Point X1 is saved only when the slope changes from positive to negative or vice versa. This computation could easily be done arithmetically, but the Boolean operation is computationally much faster.
Figure 3.2 An example of the application of the TP algorithm. (a) Original waveform generated by the UW DigiScope Genwave function (see Appendix D). (b) Reconstructed signal after one application of the TP algorithm. Reduction ratio is 512:256, PRD = 7.78%.
The TP algorithm is simple and fast, producing a fixed reduction ratio of 2:1. After selectively discarding exactly half the sampled data, we can restore the original resolution by interpolating between pairs of saved data points.

A second application of the algorithm to the already reduced data increases the reduction ratio to 4:1. Using data acquired at a 200-sps rate, this produces compressed data with a 50-sps effective sampling rate. If the bandwidth of the acquired ECG is 50 Hz, this approach violates sampling theory, since the effective sampling rate is less than twice the highest frequency present in the signal. The resulting reconstructed signal typically has a widened QRS complex and sharp edges that reduce its clinical acceptability. Another disadvantage of this algorithm is that the saved points do not represent equally spaced time intervals. This introduces short-term time distortion. However, this localized distortion is not visible when the reconstructed signal is viewed on standard clinical monitors and paper recorders.
3.3 AZTEC Algorithm
Originally developed to preprocess ECGs for rhythm analysis, the AZTEC (Amplitude Zone Time Epoch Coding) data reduction algorithm decomposes raw ECG sample points into plateaus and slopes. It provides a sequence of line segments that form a piecewise-linear approximation to the ECG.
3.4 Data Reduction
Figure 3.3 shows the complete flowchart for the AZTEC algorithm using C-language notation. The algorithm consists of two parts: line detection and line processing.

Figure 3.3(a) shows the line detection operation, which makes use of zero-order interpolation (ZOI) to produce horizontal lines. Two variables, Vmxi and Vmni, always reflect the highest and lowest elevations of the current line. The variable LineLen keeps track of the number of samples examined. We store a plateau if either the difference between Vmxi and Vmni is greater than a predetermined threshold Vth or if LineLen is greater than 50. The stored values are the length (LineLen − 1) and the average amplitude of the plateau, (Vmxi + Vmni)/2.

Figure 3.3(b) shows the line processing algorithm, which produces either a plateau or a slope depending on the value of the variable LineMode. We initialize LineMode to _PLATEAU in order to begin by producing a plateau. The production of an AZTEC slope begins when the number of samples needed to form a plateau is less than three. Setting LineMode to _SLOPE indicates that we have entered slope production mode. We then determine the direction, or sign, of the current slope by subtracting the previous line amplitude V1 from the current amplitude Vsi. We also reset the length of the slope, Tsi. The variable Vsi records the current line amplitude so that any change in the direction of the slope can be tracked. Note that Vmxi and Vmni are always updated to the latest sample before line detection begins. This forces ZOI to begin from the value of the latest sample.
Figure 3.3(a) Flowchart for the line detection operation of the AZTEC algorithm.
Figure 3.3(b) Flowchart of the line processing operation of the AZTEC algorithm.
When we reenter line processing with LineMode equal to _SLOPE, we either save or update the slope. The slope is saved either when a plateau of more than three samples can be formed
or when a change in direction is detected. If we detect a new plateau of more than three samples, we store the current slope and the new plateau. For the slope, the stored values are its length Tsi and its final elevation V1. Note that Tsi is multiplied by −1 to differentiate a slope from a plateau (i.e., the minus sign serves as a flag to indicate a slope). We also store the length and the amplitude of the new plateau, then reset all parameters and return to plateau production.

If a change in direction is detected in the slope, we first save the parameters for the current slope and then reset sign, Vsi, Tsi, Vmxi, and Vmni to produce a new AZTEC slope. Now the algorithm returns to line detection but remains in slope production mode. When there is no new plateau or change of direction, we simply update the slope's parameters, Tsi and Vsi, and return to line detection with LineMode remaining set to _SLOPE.

AZTEC does not produce a constant data reduction ratio. The ratio is frequently as great as 10 or more, depending on the nature of the signal and the value of the empirically determined threshold.
3.5 Data Reconstruction
The data array produced by the AZTEC algorithm is an alternating sequence of durations and amplitudes. A sample AZTEC-encoded data array is

18, 77, 4, 101, −5, 235, −4, 141

We reconstruct the AZTEC data by expanding the plateaus and slopes into discrete data points. For this particular example, the first two points represent a line 18 sample periods long at an amplitude of 77. The second set of two points represents another line segment 4 samples long at an amplitude of 101. The first value in the third set of two points is negative. Since this represents the length of a line segment, and we know that length must be positive, we recognize that this minus sign is the flag indicating that this particular set of points represents a line segment with nonzero slope. This line is five samples long, beginning at the end of the previous line segment (i.e., amplitude of 101) and ending at an amplitude of 235. The next set of points is also a line with nonzero slope, beginning at an amplitude of 235 and ending 4 sample periods later at an amplitude of 141.

This reconstruction process produces an ECG signal with steplike quantization, which is not clinically acceptable. The AZTEC-encoded signal needs postprocessing with a curve smoothing algorithm or a low-pass filter to remove its jagged appearance and produce more acceptable output. The least-squares polynomial smoothing filter described in Chapter 5 is an easy and fast method for smoothing the signal. This family of filters fits a parabola to an odd number (2L + 1) of input data points. Taking L = 3, we obtain

p_k = (1/21)(−2x_{k−3} + 3x_{k−2} + 6x_{k−1} + 7x_k + 6x_{k+1} + 3x_{k+2} − 2x_{k+3})
where p_k is the new data point and x_k is the expanded AZTEC data. The smoothing function acts as a low-pass filter to reduce the discontinuities. Although this produces more acceptable output, it also introduces amplitude distortion.
Figure 3.4 shows examples of the AZTEC algorithm applied to an ECG.
Figure 3.4 Examples of AZTEC applications. (a) Original waveform generated by the UW DigiScope Genwave function (see Appendix D). (b) Small threshold, reduction ratio = 512:233, PRD = 24.4%. (c) Large threshold, reduction ratio = 512:153, PRD = 28.1%. (d) Smoothed signal from (c), L = 3, PRD = 26.3%.
3.6 CORTES Algorithm
The CORTES (Coordinate Reduction Time Encoding System) algorithm is a hybrid of the TP and AZTEC algorithms. It attempts to exploit the strengths of each while sidestepping the weaknesses. CORTES uses AZTEC to discard clinically insignificant data in the isoelectric region with a high reduction ratio and applies the TP algorithm to the clinically significant high-frequency regions (QRS complexes). It executes the AZTEC and TP algorithms in parallel on the incoming ECG data.

Whenever an AZTEC line is produced, the CORTES algorithm decides, based on the length of the line, whether the AZTEC data or the TP data are to be saved. If the line is longer than an empirically determined threshold, it saves the AZTEC line. Otherwise it saves the TP data points. Since TP is used to encode the QRS complexes, only AZTEC plateaus, not slopes, are implemented.

The CORTES algorithm reconstructs the signal by expanding the AZTEC plateaus and interpolating between each pair of TP data points. It then applies parabolic smoothing to the AZTEC portions to reduce discontinuities.
3.7 Fan Algorithm
Originally used for ECG telemetry, the Fan algorithm draws lines between pairs of starting and ending points so that all intermediate samples are within some specified error tolerance. Figure 3.6 illustrates the principles of the Fan algorithm. We start by accepting the first sample X0 as the nonredundant permanent point. It functions as the origin and is also called the originating point. We then take the second sample X1 and draw two slopes {U1, L1}. U1 passes through the point (X0, X1 + I), and L1 passes through the point (X0, X1 − I). If the third sample X2 falls within the area bounded by the two slopes, we generate two new slopes {U2, L2} that pass through the points (X0, X2 + I) and (X0, X2 − I). We compare the two pairs of slopes and retain the most converging (restrictive) slopes (i.e., {U1, L2} in our example). Next we assign the value of X2 to X1 and read the next sample into X2. As a result, X2 always holds the most recent sample and X1 holds the sample immediately preceding X2. We repeat the process by comparing X2 to the values of the most convergent slopes. If it falls outside this area, we save the length of the line T and its final amplitude X1, which then becomes the new originating point X0, and the process begins anew. The sketch of the slopes drawn from the originating sample to future samples forms a set of radial lines similar to a fan, giving this algorithm its name.

When adapting the Fan algorithm to a C-language implementation, we create the variables XU1, XL1, XU2, and XL2 to determine the bounds of X2. From Figure 3.6(b), we can show that
    XU2 = (XU1 - X0)/T + XU1

and

    XL2 = (XL1 - X0)/T + XL1
where T = tT - t0.
Figure 3.6 Illustration of the Fan algorithm. (a) Upper and lower slopes (U and L) are drawn within error threshold ε around sample points taken at t1, t2, … (b) Extrapolation of XU2 and XL2 from XU1, XL1, and X0.
Figure 3.7 shows the C-language fragment that implements the Fan algorithm.
Figure 3.8 shows an example of the Fan algorithm applied to an ECG signal.
Figure 3.7 Fragment of C-language program for implementation of the Fan algorithm.
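The logic described above can be sketched as a short C routine. This is a simplified reconstruction, not the original DigiScope fragment: the function name `fan_compress`, the integer tolerance `eps`, and the (amplitude, length) output format are assumptions, and the integer division makes the slope extrapolation approximate.

```c
/* Rough sketch of the Fan algorithm (assumed interface, not the
 * original DigiScope fragment).  x[] holds n >= 2 samples; eps is the
 * error tolerance.  Permanent points are written as (amplitude, length)
 * pairs to out_amp[]/out_len[]; the return value is the pair count. */
int fan_compress(const int *x, int n, int eps, int *out_amp, int *out_len)
{
    int m = 0;
    int X0 = x[0];           /* originating (permanent) sample      */
    int X1 = x[1];           /* most recent retained sample         */
    int XU1 = X1 + eps;      /* upper bound at the previous sample  */
    int XL1 = X1 - eps;      /* lower bound at the previous sample  */
    int T = 1;               /* samples since the originating point */

    out_amp[m] = X0; out_len[m] = 0; m++;     /* first permanent point */
    for (int i = 2; i < n; i++) {
        int X2 = x[i];
        /* Extrapolate both bounds one sample ahead (integer division
         * truncates; a real implementation would scale or use floats). */
        int XU2 = XU1 + (XU1 - X0) / T;
        int XL2 = XL1 + (XL1 - X0) / T;
        if (XL2 <= X2 && X2 <= XU2) {
            /* Inside the fan: keep the most convergent slopes. */
            T++;
            if (X2 + eps < XU2) XU2 = X2 + eps;
            if (X2 - eps > XL2) XL2 = X2 - eps;
            XU1 = XU2;
            XL1 = XL2;
        } else {
            /* Outside: save the line ending at X1 and restart there. */
            out_amp[m] = X1; out_len[m] = T; m++;
            X0 = X1;
            XU1 = X2 + eps;
            XL1 = X2 - eps;
            T = 1;
        }
        X1 = X2;
    }
    out_amp[m] = X1; out_len[m] = T; m++;     /* flush the final line */
    return m;
}
```

A constant or linear segment collapses to just two permanent points, which is the behavior the figure illustrates.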
Figure 3.8 Examples of Fan algorithm applications. (a) Original waveform generated by the UW DigiScope Genwave function (see Appendix D). (b) Small tolerance, reduction ratio = 512:201, PRD = 5.6%. (c) Large tolerance, reduction ratio = 512:155, PRD = 7.2%. (d) Smoothed signal from (c), L = 3, PRD = 8.5%.
We reconstruct the compressed data by expanding the lines into discrete points. The Fan algorithm guarantees that the error between the line joining any two permanent sample points and any actual (redundant) sample along the line is less than or equal to the magnitude of the preset error tolerance. The algorithm's reduction ratio depends on the error tolerance. When compared to the TP and AZTEC algorithms, the Fan algorithm produces better signal fidelity for the same reduction ratio.

Three algorithms based on Scan-Along Approximation (SAPA) techniques closely resemble the Fan algorithm.
The SAPA-2 algorithm produces the best results among all three algorithms. As in the Fan algorithm, SAPA-2 guarantees that the deviation between the straight lines (reconstructed signal) and the original signal never exceeds a preset error tolerance.

In addition to the two slopes calculated in the Fan algorithm, SAPA-2 calculates a third slope, called the center slope, between the originating sample point and the actual future sample point. Whenever the center slope value does not fall within the boundary of the two converging slopes, the immediately preceding sample is taken as the originating point. Therefore, the only apparent difference between SAPA-2 and the Fan algorithm is that SAPA-2 uses the center slope criterion instead of the actual sample value criterion.
3.8 Huffman Coding
Huffman coding exploits the fact that the discrete amplitudes of a quantized signal do not occur with equal probability. It assigns variable-length code words to a given quantized data sequence according to their frequency of occurrence. Data that occur frequently are assigned shorter code words.
3.8.1 Static Huffman Coding
Figure 3.9 illustrates the principles of Huffman coding. As an example, assume that we wish to transmit a set of 28 data points.
The set consists of seven distinct quantized levels, or symbols. For each symbol Si, we calculate its probability of occurrence Pi by dividing its frequency of occurrence by 28, the total number of data points. Consequently, the construction of a Huffman code for this set begins with seven nodes, one associated with each Pi. At each step we sort the Pi list in descending order, breaking ties arbitrarily. The two nodes with the smallest probabilities, Pi and Pj, are merged into a new node with probability Pi + Pj. This process continues until the probability list contains a single value, 1.0, as shown in Figure 3.9(a).
Figure 3.9 Illustration of Huffman coding. (a) At each step, the Pi are sorted in descending order and the two lowest Pi are merged. (b) Merging operation depicted in a binary tree. (c) Summary of Huffman coding for the data set.
The process of merging nodes produces a binary tree as in Figure 3.9(b). When we merge two nodes with probability Pi + Pj, we create a parent node with two children represented by Pi and Pj. The root of the tree has probability 1.0. We obtain the Huffman code of the symbols by traversing down the tree, assigning 1 to the left child and 0 to the right child. The resulting code words have the prefix property (i.e., no code word is a proper prefix of any other code word). This property ensures that a coded message is uniquely decodable without the need for lookahead. Figure 3.9(c) summarizes the results and shows the Huffman codes for the seven symbols. We enter these code word mappings into a translation table and use the table to pack the appropriate code word into the output bit stream in the reduction process.

The reduction ratio of Huffman coding depends on the distribution of the source symbols. In our example, the original data require three bits to represent the seven quantized levels. After Huffman coding, we can calculate the expected code word length

    L = Σ li Pi   (summed over the seven symbols)

where li represents the length of the Huffman code for symbol Si. This value is 2.65 in our example, resulting in an expected reduction ratio of 3:2.65.

The reconstruction process begins at the root of the tree. If bit 1 is received, we traverse down the left branch; otherwise, the right branch. We continue traversing until we reach a node with no child. We then output the symbol corresponding to this node and begin traversal from the root again.

The reconstruction process of Huffman coding perfectly recovers the original data, so it is a lossless algorithm. However, a transmission error of a single bit may result in more than one decoding error. This propagation of transmission error is a consequence of all algorithms that produce variable-length code words.
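The merge procedure described above can be sketched as a short C routine. This array-based O(n²) version is only an illustration, and the example frequencies in the usage below are assumed (they sum to 28 but are not necessarily the report's original data set):

```c
/* Build Huffman code lengths for n symbols (n <= 16) from integer
 * frequencies, using a simple O(n^2) selection of the two lightest
 * live nodes at each merge step. */
void huffman_lengths(const int *freq, int n, int *len)
{
    int weight[32], parent[32], alive[32];
    int m = n;                          /* nodes allocated so far */
    for (int i = 0; i < n; i++) {
        weight[i] = freq[i];
        parent[i] = -1;
        alive[i] = 1;
    }
    for (int step = 0; step < n - 1; step++) {
        int a = -1, b = -1;             /* indices of two lightest nodes */
        for (int i = 0; i < m; i++) {
            if (!alive[i]) continue;
            if (a < 0 || weight[i] < weight[a]) { b = a; a = i; }
            else if (b < 0 || weight[i] < weight[b]) { b = i; }
        }
        weight[m] = weight[a] + weight[b];   /* merged parent node */
        parent[m] = -1;
        alive[m] = 1;
        parent[a] = parent[b] = m;
        alive[a] = alive[b] = 0;
        m++;
    }
    /* A symbol's code length equals its depth in the merge tree. */
    for (int i = 0; i < n; i++) {
        int d = 0;
        for (int p = parent[i]; p >= 0; p = parent[p]) d++;
        len[i] = d;
    }
}
```

Because every Huffman tree for a given frequency set has the same total cost, the weighted sum Σ freq[i]·len[i] does not depend on how ties are broken, even though individual code words may differ.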
3.8.2 Modified Huffman Coding
The implementation of Huffman coding requires a translation table in which each source symbol is mapped to a unique code word. If the original data were quantized into 16-bit numbers, the table would need to contain 2^16 records. A table of this size creates memory problems and processing inefficiency.
In order to reduce the size of the translation table, the modified Huffman coding scheme partitions the source symbols into a frequent set and an infrequent set. For all the symbols in the frequent set, we form a Huffman code as in the static scheme. We then use a special code word as a prefix to indicate any symbol from the infrequent set and attach a suffix corresponding to the ordinary binary encoding of the symbol.
Assume that we are given a data set similar to the one before. Assume also that we anticipate quantized level 0 to appear in some future transmissions. We may decide to partition the quantized levels {0, 7} into the infrequent set. We then apply Huffman coding as before and obtain the results in Figure 3.10. Note that quantized levels in the infrequent set have codes with the prefix 0100, making their code length much longer than those of the frequent set. It is therefore important to keep the probability of the infrequent set sufficiently small to achieve a reasonable reduction ratio.
Some modified Huffman coding schemes group quantized levels centered about 0 into the frequent set and derive two prefix codes for symbols in the infrequent set. One prefix code denotes large positive values and the other denotes large negative values.
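An encoder for such a scheme only needs the frequent-set table and the escape prefix. The sketch below is illustrative: the code words, the choice of 0100 as the escape prefix, and the 3-bit suffix width are invented for this example, not taken from the report's actual tables.

```c
#include <string.h>

/* Illustrative frequent-set code words for quantized levels 0..7;
 * NULL marks the infrequent set {0, 7}.  These code words and the
 * 3-bit suffix width are invented for this sketch. */
static const char *freq_code[8] = {
    NULL, "11", "10", "011", "0101", "0011", "0010", NULL
};
#define INFREQ_PREFIX "0100"    /* escape code for infrequent symbols */

/* Append the bit string encoding quantized level s (0..7) to out. */
void mh_emit(int s, char *out)
{
    if (freq_code[s]) {
        strcat(out, freq_code[s]);           /* frequent: Huffman code */
    } else {
        strcat(out, INFREQ_PREFIX);          /* infrequent: prefix...  */
        for (int b = 2; b >= 0; b--)         /* ...plus plain binary   */
            strcat(out, ((s >> b) & 1) ? "1" : "0");
    }
}
```

With this table, level 1 encodes in two bits while the rarely seen level 0 costs seven (0100 plus 000), which is why the infrequent set must stay improbable.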
3.9 Adaptive Coding
Huffman coding requires a translation table for encoding and decoding. It is necessary to examine the entire data set, or portions of it, to determine the data statistics. The translation table must also be transmitted or stored for correct decoding.

An adaptive coding scheme attempts to build the translation table as data are presented. A dynamically derived translation table is sensitive to the variation in local statistical information. It can therefore alter its code words according to local statistics to maximize the reduction ratio. It also achieves extra space saving because there is no need for a static table.

An example of an adaptive scheme is the Lempel-Ziv-Welch (LZW) algorithm. The LZW algorithm uses a fixed-size table. It initializes some positions of the table for some chosen data sets. When it encounters new data, it uses the uninitialized positions so that each unique data word is assigned its own position. When the table is full, the LZW algorithm reinitializes the oldest or least-used position according to the new data. During data reconstruction, it incrementally rebuilds the translation table from the encoded data.
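A minimal sketch of the LZW encoding loop in C follows. It uses a fixed 4096-entry table and a linear dictionary search, and it omits the reinitialization policy; all of these are simplifications of the scheme described above.

```c
#define DICT_MAX 4096   /* fixed-size table */

/* Minimal LZW encoder over bytes (n >= 1).  Dictionary entries above
 * 255 are (prefix code, appended byte) pairs found by linear search;
 * a practical version would hash and reinitialize stale entries. */
int lzw_encode(const unsigned char *in, int n, int *codes)
{
    int pref[DICT_MAX], suff[DICT_MAX];
    int size = 256;                 /* codes 0..255 preinitialized */
    int m = 0;
    int cur = in[0];
    for (int i = 1; i < n; i++) {
        int c = in[i], next = -1;
        for (int k = 256; k < size; k++)     /* look up (cur, c) */
            if (pref[k] == cur && suff[k] == c) { next = k; break; }
        if (next >= 0) {
            cur = next;                      /* extend current match */
        } else {
            codes[m++] = cur;                /* emit code for match  */
            if (size < DICT_MAX) {           /* grow the dictionary  */
                pref[size] = cur;
                suff[size] = c;
                size++;
            }
            cur = c;
        }
    }
    codes[m++] = cur;                        /* flush final match */
    return m;
}
```

On a repetitive input such as "ABABAB" the encoder quickly starts emitting dictionary codes (256 and up) in place of raw bytes, which is where the reduction comes from.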
Figure 3.10 Results of modified Huffman coding. Quantized levels {0, 7} are grouped into the infrequent set.
3.10 Residual Differencing
Typically, neighboring signal amplitudes are not statistically independent. Conceptually we can decompose a sample value into a part that is correlated with past samples and a part that is uncorrelated. Since the intersample correlation corresponds to a value predicted using past samples, it is redundant and removable. We are then left with the uncorrelated part, which represents the prediction error or residual signal. Since the amplitude range of the residual signal is smaller than that of the original signal, it requires fewer bits for representation. We can further reduce the data by applying Huffman coding to the residual signal. We briefly describe two ECG reduction algorithms that make use of residual differencing.
Ruttimann and Pipberger applied modified Huffman coding to residuals obtained from prediction and interpolation. In prediction, sample values are obtained by taking a linearly weighted sum of an appropriate number of past samples

    x'(nT) = Σ ak x(nT - kT),   k = 1, …, p

where x(nT) are the original data, x'(nT) are the predicted samples, and p is the number of samples employed in prediction. The parameters ak are chosen to minimize the expected mean squared error E[(x - x')^2]. When p = 1, we choose a1 = 1 and say that we are taking the first difference of the signal. Preliminary investigations on test ECG data showed that there was no substantial improvement from using predictors higher than second order. In interpolation, the estimator of the sample value consists of a linear combination of past and future samples. The results for the predictor indicated a second-order estimator to be sufficient. Therefore, the interpolator uses only one past and one future sample

    x'(nT) = a x(nT - T) + b x(nT + T)

where the coefficients a and b are determined by minimizing the expected mean squared error. The residuals of prediction and interpolation are encoded using a modified Huffman coding scheme, where the frequent set consists of some quantized levels centered about zero. Encoding using residuals from interpolation resulted in a higher reduction ratio of approximately 7.8:1.
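For the simplest predictor (p = 1, a1 = 1), the residual and its lossless reconstruction reduce to a first difference and a running sum, sketched here:

```c
/* First-difference residual: r[0] = x[0], r[i] = x[i] - x[i-1].
 * Because neighboring ECG samples are highly correlated, the
 * residuals cluster near zero and Huffman-code compactly. */
void first_difference(const int *x, int n, int *r)
{
    r[0] = x[0];
    for (int i = 1; i < n; i++)
        r[i] = x[i] - x[i - 1];
}

/* Exact (lossless) reconstruction by running sum. */
void integrate(const int *r, int n, int *x)
{
    x[0] = r[0];
    for (int i = 1; i < n; i++)
        x[i] = x[i - 1] + r[i];
}
```

Note how a slowly varying segment such as {100, 102, 105, 104, 101} maps to the small-amplitude residual {100, 2, 3, -1, -3}, which needs far fewer bits per sample after entropy coding.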
Hamilton and Tompkins exploited the fact that a typical ECG signal is composed of a repeating pattern of beats with little change from beat to beat. The algorithm calculates and updates an average beat estimate as data are presented. When it detects a beat, it aligns and subtracts the detected beat from the average beat. The residual signal is Huffman coded and stored along with a record of the beat locations. Finally, the algorithm uses the detected beat to update the average beat estimate. In this scheme, the estimation of beat location and quantizer location can significantly affect reduction performance.
3.11 Run-Length Encoding
Used extensively in facsimile technology, run-length encoding exploits the high degree of correlation that occurs in successive bits in the facsimile bit stream. A bit in the facsimile output may be either 1 or 0, depending on whether it is a black or white pixel. On a typical document, there are clusters of black and white pixels that give rise to this high correlation.

Run-length encoding simply transforms the original bit stream into the string {v1, l1, v2, l2, …} where the vi are the values and the li are the lengths. The observant reader will quickly recognize that both AZTEC and the Fan algorithm are special cases of run-length encoding. Take for example the output stream {1, 1, 1, 1, 1, 3, 3, 3, 3, 0, 0, 0} with 12 elements. The output of run-length encoding, {1, 5, 3, 4, 0, 3}, contains only six elements. Further data reduction is possible by applying Huffman coding to the output of run-length encoding.
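The transformation can be sketched in a few lines of C; run on the example stream above, it reproduces the six-element output:

```c
/* Run-length encode x[0..n-1] into interleaved (value, length) pairs
 * in out[]; returns the number of pairs written. */
int rle_encode(const int *x, int n, int *out)
{
    int m = 0;
    for (int i = 0; i < n; ) {
        int j = i;
        while (j < n && x[j] == x[i]) j++;   /* scan one run */
        out[m++] = x[i];                     /* run value    */
        out[m++] = j - i;                    /* run length   */
        i = j;
    }
    return m / 2;
}
```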
3.12 LAB: ECG Data Reduction Algorithms
This lab explores the data reduction techniques reviewed in this chapter. Load UW DigiScope according to the directions in Appendix D.
3.12.1 Turning Point Algorithm
From the ad(V) Ops menu, select C(O)mpress and then (T)urn pt. The program compresses the waveform displayed on the top channel using the TP algorithm, then decompresses, reconstructs using interpolation, and displays the results on the bottom channel. Perform the TP algorithm on two different ECGs read from files and on a sine wave and a square wave. Observe

1. Quality of the reconstructed signal
2. Reduction ratio
3. Percent root-mean-square difference (PRD)
4. Power spectra of original and reconstructed signals

Tabulate and summarize all your observations.
3.12.2 AZTEC Algorithm
Run the AZTEC algorithm by selecting (A)ztec from the COMPRESS menu. Using at least three different threshold values (try 1%, 5%, and 15% of the full-scale peak-to-peak value), observe and comment on the items in the list in section 3.12.1. In addition, summarize the quality of the reconstructed signals both before and after applying the smoothing filter. Tabulate and summarize all your observations.
3.12.3 Fan Algorithm
Run the Fan algorithm by selecting (F)an from the COMPRESS menu. What can you deduce from comparing the performance of the Fan algorithm with that of the AZTEC algorithm? Tabulate and summarize all your observations.
3.12.4 Huffman Coding
Select (H)uffman from the COMPRESS menu. Select (R)un in order to Huffman encode the signal that is displayed on the top channel. Do not use first differencing at this point in the experiment. Record the reduction ratio. Note that this reduction ratio does not include the space needed for the translation table, which must be stored or transmitted. What can you deduce from the PRD? Select (W)rite table to write the Huffman data into a file. You may view the translation table later with the DOS type command after exiting from SCOPE.

Load a new ECG waveform and repeat the steps above. When you select (R)un, the program uses the translation table derived previously to code the signal. What can you deduce from the
reduction ratio? After deriving a new translation table using (M)ake from the menu, select (R)un again and comment on the new reduction ratio.

Select (M)ake again and use first differencing to derive a new Huffman code. Is there a change in the reduction ratio using this newly derived code? Select (W)rite table to write the Huffman data into a file. Now reload the first ECG waveform that you used. Without deriving a new Huffman code, observe the reduction ratio obtained. Comment on your observations.
Exit from the SCOPE program to look at the translation tables that you generated. What comments can you make regarding the overhead involved in storing a translation table?
CHAPTER-4
CONCLUSION
The feeling of being in virtual contact with health care professionals provides a sense of safety to the subjects without the hassles of permanent monitoring. The system offers a valuable tool for easy measurement of the ECG and provides first-hand help whenever the patient requires immediate medical attention.
The results achieved were quite satisfactory: the DCT was found to achieve a compression ratio of 90.43 with a PRD of 0.93. The signal analysis technique using LabVIEW software, in which various abnormalities are checked for and the problem in a particular patient's ECG is finally displayed, also gave a positive indication. With this as the goal, further implementation in MATLAB Simulink is also under way.

The comparison table below lists the results of the compression techniques and gives a basis for selecting the most suitable compression method. Again, in this project the DCT was found to achieve a compression ratio of 90.43 with a PRD of 0.93.
Comparison of compression techniques
METHOD COMPRESSION RATIO PRD
CORTES 4.8 3.75
TURNING POINT 5 3.20
AZTEC 10.37 2.42
FFT 89.57 1.16
DCT 90.43 0.93
CHAPTER-5
REFERENCES
[1] Jalaleddine, S., Hutchens, C., Strattan, R., and Coberly, W. A. 1990. ECG data compression techniques: a unified approach. IEEE Trans. Biomed. Eng., 37: 329-343.

[2] Reddy, B. R. S. and Murthy, I. S. N. 1986. ECG data compression using Fourier descriptors. IEEE Trans. Biomed. Eng., BME-33: 428-433.

[3] Reddy, D. C. 2007. Biomedical Signal Processing: Principles and Techniques, 254-300. Tata McGraw-Hill, third reprint.

[4] Abenstein, J. P. and Tompkins, W. J. 1982. New data-reduction algorithm for real-time ECG analysis. IEEE Trans. Biomed. Eng., BME-29: 43-48.

[5] Bohs, L. N. and Barr, R. C. 1988. Prototype for real-time adaptive sampling using the Fan algorithm. Med. & Biol. Eng. & Comput., 26: 574-583.

[6] Cox, J. R., Nolle, F. M., Fozzard, H. A., and Oliver, G. C. Jr. 1968. AZTEC: a preprocessing program for real-time ECG rhythm analysis. IEEE Trans. Biomed. Eng., BME-15: 128-129.

[7] Huffman, D. A. 1952. A method for the construction of minimum-redundancy codes. Proc. IRE, 40: 1098-1101.

[8] Bousseljot, R. D. and Kreiseler, D. 1998. ECG analysis by signal pattern comparison. Biomedical Engineering, 43: 156-157.