Upload
dangthuy
View
215
Download
1
Embed Size (px)
Citation preview
1. Design Objective:To design “Heart Rate counter Using MATLAB”, having features:
In this technique we are using CMOS camera to record video and by using
image acquisition technique and image manipulation, we can detect variation in frames
which are stored in the video signal.
When the camera is covered with a finger, the naked eye sees a static red frame,
but there are subtle variations caused by the blood flow under the skin that can be
recovered processing the video signal.
After the signal acquisition, a band-pass filter attenuates frequencies outside the
interest band. This reduces the noise in later processing steps (peak fine-tuning) and
makes the resulting heart rate signal smoother.
The Discrete Fourier Transform (DFT) is used to translate the signal from the
time domain to the frequency domain. The Fast Fourier Transform (FFT) algorithm
was used to save processing time when computing the DFT.
Once the FFT is computed for the current sliding window contents, magnitude
peaks in the interest band are spotted thanks to the findpeaks function in Matlab. A
sample is taken as a peak if it is either larger than its two neighbors or equal to infinity.
Among the resulting peaks, the highest peak position is sought with the max function.
2. Physical and Technical Description:
1
Physical description:
Operation system: Windows 8, Windows 7 Service Pack 1, Windows Vista Service Pack 2, Windows XP Service Pack 3, Windows XP x64 Edition Service Pack 2, Windows Server 2012, Windows Server 2008 R2 Service Pack 1, Windows Server 2008 Service Pack 2, Windows Server 2003 R2 Service Pack 2
Any Intel or AMD x86 processor supporting SSE2 instruction set* 1 GB for MATLAB only, 3–4 GB for a typical installation. 1024 MB (At least 2048 MB recommended) No specific graphics card is required. Hardware accelerated graphics card
supporting OpenGL 3.3 with 1GB GPU memory recommended.
Technical description: MATLAB Version 7.8.0.347 (R2009a)
MATLAB License Number: 161051
Operating System: Microsoft Windows Vista Version 6.1 (Build 7600)
Java VM Version: Java 1.6.0_04-b12 with Sun Microsystems Inc. Java
HotSpot(TM) Client VM mixed mode
Image Acquisition toolbox.
Image Processing toolbox.
Digital Signal Processing toolbox.
3. System description and block diagram:
3.1 Introduction:
2
Heart rate is measured by finding the pulse of the heart. This pulse rate can be found at
any point on the body where the artery's pulsation is transmitted to the surface by
pressuring it with the index and middle fingers; often it is compressed against an
underlying structure like bone. (A good area is on the neck, under the corner of the
jaw.) But for this project we shall use any of our fingers as they will
be pressed against the camera sensor with a flash light upon it.
What is a heart rate monitor, and how can it help improve your performance? A heart
rate monitor is a personal monitoring device that allows one to measure one's heart
rate in real time or record the heart rate for later study. It is largely used by performers
of various types of physical exercise.
In this era of technology where each one of us holds a smart phone with a CMOS
camera, can record his or her heart rate at any place. Previously we use to have devices
or wrist watches that used complex circuitry which weren’t cost efficient and multi-
purpose. Now this task has been simplified. With the help of a camera and flash, we
can easily monitor our heart rate and analyze it.
3.2 Applications: Maintaining track record of Exercise (Cycling).
Electro cardio gram plotting with Image processing.
Biometric use of image processing rather than analog method.
We use to have devices or wrist watches that used complex circuitry which
weren’t cost efficient and multi-purpose.
4. System Description:-
4.1. Video signal acquisition
3
The first thing to take into account when it comes to sampling a signal is bandwidth.
The normal human heartbeat is between 60 and 200 beats per minute (bpm), depending
on age, fitness condition and the physical activity that the subject is doing. In our
experiment, we have generously assumed that the target signal can be found between
40 and 230 bpm, i.e. 0.667 and 3.833 Hz. According to Nyquist, the sampling
frequency should be at least twice the highest value (7.667 Hz) to catch the whole
range of heartbeat frequencies without aliasing. Using a webcam we have recorded
video by covering it with a flash light at 30 fps which is three times the minimum.
4.2. Brightness signal computation
The signal we want to process is the brightness of the skin over time. We cannot ensure
that all pixels in the image will contain the brightness variation that we are looking for
and we want the rest of the processing pipeline to stay computationally light.
Therefore, we chose to combine all pixels into a single average brightness value per
frame. On the other hand, as we are thinking about implementing the algorithm in real
time, we have skipped the common image brightness computation —combining the
red, green and blue planes— in favor of a simple average of all the pixels in the red
plane. This is computationally much cheaper and gives very similar results because
almost all the image energy is in the red plane. So, the n-th sample of the red
brightness function can be expressed as:
Where:
W : width of the image in pixels
H : height of the image in pixels
v [n, x, y, 1] : light level of the red plane (index 1) at [x, y] coordinates of frame n in the
video signal
In MATLAB code, the red brightness signal can be computed with the following code
snippet:
4
As the image size is constant over time and we are only interested in the shape of the
signal, not in its amplitude, we could even omit the division by the total number of
pixels. We are leaving it like that because it will later help visualizing the different
spectrum peaks.
4.3. Band-pass filtering
After the signal acquisition, a band-pass filter attenuates frequencies outside the
interest band. This reduces the noise in later processing steps (peak fine-tuning) and
makes the resulting heart rate signal smoother. For our case, a second-order
Butterworth filter is designed. The cutoff frequencies have been set to contain our band
of interest: 40-230 bpm (Fig. 3). The Matlab code to design and apply the filter is:
Notice how an initial piece of 1 second is cut off the filtered signal. It is an
approximation of the time it takes for the filter to completely remove the constant
signal offset. If this initial piece is not removed, we might get bad readings of the heart
beat during the signal stabilization time.
5
Fig 4.3.1 Filter frequency response
Fig 4.3.2 Original signal (vs) Filtered Signal
On the left, the filter frequency response. On the right, original signal (blue) vs filtered
signal (green). Notice the transient during the first second.
Other filter types can be tested. We chose Butterworth because:
It is an IIR filter and the order required for a given bandwidth is much lower
than with a FIR filter. Lower order usually means less computations.
It has flat pass-band and stop-bands compared to other IIR structures that show
ripples. This avoids favoring certain frequencies over others in the valid range.
4.4. Fast Fourier Transform
The Discrete Fourier Transform (DFT) is used to translate the signal from the time
domain to the frequency domain. The Fast Fourier Transform (FFT) algorithm was
used to save processing time when computing the DFT. While the computational
complexity of the DFT is O(N2) for a set of N points, the FFT gets the same results
with O(N · log2(N)), which means a huge speed-up when N is high.
6
There is a special command to compute the FFT in Matlab. The FFT of a real signal is
a complex signal in which each complex sample represents the magnitude and the
phase of the corresponding frequency. In our case, the phase is not needed. The FFT
magnitude is easily computed in MATLAB.
4.5. Sliding window
In order to give a continuous estimation of the heart rate, the FFT and the following
two steps (peak detection and smoothing) are repeated every 0.5 seconds. This
computation is always performed over a window containing the last 6 seconds of
signal samples. Virtually, the window moves over the signal and this is why it is called
"sliding" or "moving" window.
The 6-second length is not arbitrary. The window length directly affects frequency
resolution and, thus, the accuracy of our estimation. The FFT of a signal
sampled N times at a sampling frequency Fs is N bins long. All the bins together cover
a bandwidth of Fs. So, the frequency difference between two consecutive bins is Fs / N.
This is the frequency resolution (Fr). As the sampling frequency can be written as the
number of window samples divided by the total time it took to sample them (the
window time duration), we can say that:
Therefore, the higher the window duration, the better the frequency resolution. The
accuracy will be better, too, as it is half the resolution in this case.
7
However, increasing the window duration decreases the time accuracy. Think about the
trivial case in which the whole signal length is picked as the window length. If a peak
is detected in the FFT, it is impossible to tell when that tone started within the signal or
how long it lasted. Whatever number we give will be a maximum of a window length
away from the real value. Another problem of a long window is that it will force the
user to wait for an equally long period to get a first reading after starting up the
measurements.
In summary, with a 6-second window, we get a tolerable 6-second startup delay that
gives a fair time accuracy of 6 seconds and a fair frequency accuracy of 5 bpm (half
the FFT resolution). All in all, it looks like a good trade-off.
Why do we move the window in 0.5-second steps? Computing an estimate every 0.5
seconds does not improve the time accuracy of the output, but it increases the time
resolution of the reading. It produces more heart rate output samples per second that
will be later smoothed to provide a more continuous and frequent reading. With this,
we are incrementing the time resolution of the reading but not its accuracy, which stays
limited by the time resolution of the FFT.
4.6. Leakage reduction:
Fig.4.6.1:Hann window for the 6-second sliding
window
8
The DFT works ideally with infinite-time signals. A time-limited signal of length N is
equivalent to multiplying its infinite-time counterpart by a rectangular signal of length
N and amplitude 1. Frequency-wise, this results in convolving the infinite-time signal
spectrum by the rectangular signal spectrum, producing leakage.
In order to reduce leakage, before computing the DFT, the input signal is multiplied by
a function whose boundaries are zero. This forces the resulting boundary values to
zero. The multiplying function is usually called "window". It must not be confused
with the sliding window that we have talked about in the previous section. There are
many window functions in the literature, each having their own virtues and
disadvantages. I have particularly chosen the Hann window because it offers good
resolution and good leakage rejection. Feel free to try others. There is a good
comparative analysis of different window functions here.
In Matlab, the FFT magnitude for the sliding window with the Hann window can be
computed with the following code. The result generated by the hann function is
transposed because it is a column vector, while our signal is a row vector:
4.7. Peak detection:
Once the FFT is computed for the current sliding window contents, magnitude peaks in
the interest band are spotted thanks to the findpeaks function in Matlab. A sample is
taken as a peak if it is either larger than its two neighbors or equal to infinity. Among
the resulting peaks, the highest peak position is sought with the max function. Finally,
it is translated to the corresponding frequency in the FFT vector:
9
4.8. Smoothing
At this stage, an approximate location for the most powerful tone in the frequency band
has been found, but the possible outcomes are a discrete set in 10 bpm increments
because of the frequency resolution produced by the 6-second window. We would like
that the heart rate readings look more continuous, with 1 bpm frequency resolution
instead. To achieve this, the signal window is correlated with a series of tones in phase
and quadrature around the FFT peak in 1 bpm increments. The tones lie in the
uncertainty interval around the peak caused by the FFT frequency resolution. The
result of each signal-tone correlation is a complex number representing a phase-
magnitude pair. The frequency that corresponds to the highest magnitude is taken as
the smoothed heart rate:
10
Fig 4.8.1 Smoothing FFT curve
Fig. 5: Smoothing of an FFT peak. The DFT is computed on a zero-padded version of
the signal in the window, but only within a range of FFT frequencies around the
highest peak. The resulting DFT amplitudes form the red curve above the FFT, in blue.
The frequency with the highest DFT amplitude, which is marked with the black
square, is chosen as the current smoothed heart rate.
At first, We came up with this method intuitively, thinking of the correlation as a
measure of similarity. My goal was to find the tone most similar to the signal in a
frequency range around the FFT peak, regardless of the phase, by measuring the
similarity (correlation) between the signal and a series of reference complex tones. In
fact, it is what the FFT does, but using only orthonormal tones, whose period is an
integer number of samples. Since the FFT frequencies are orthonormal, they constitute
a base in which all signals can be expressed by linear combination. Correlating against
other intermediate frequencies does not give more information because they
themselves are formed by linearly combining the orthonormal ones. This is the reason
why the result of this method is just smoothed FFT data, instead of higher accuracy
extra information. Later, I realized that this method is equivalent to computing the FFT
on a zero-padded version of the signal, which is a commonly used technique to smooth
the FFT data, and picking only a frequency range. If the original 6-second signal in the
window is zero-padded up to 60 seconds, the tone frequencies used in this method are
found among the orthonormal frequencies of the 60-second FFT. Then, the method is
11
equivalent to applying the DFT definition on the zero-padded window for certain
frequencies only.
The advantage of this "zero-padding partial DFT" over the "zero-padding FFT" is
speed. The FFT computes the result as a whole and cannot be divided to get subresults
in a continuous frequency range; therefore, its complexity stays at O(Nz · log2(Nz)),
where Nz is the length of the zero-padded window. On the other hand, the complexity
of the proposed method is O(S · N), where N is the window length and S is the number
of test tones. So, the complexity reduction factor (CRF) is:
For a given sampling frequency, the higher the new frequency resolution, the more
beneficial will be using the "zero-padding partial DFT" instead of the "zero-padding
FFT". In our case, Fr' = 1/60 Hz and Fs = 24 Hz in the worst case. So, theCRF is
approximately 10.5, which means that —roughly speaking— the time taken by the
proposed method will be something in the order of a 10% of the time taken by the FFT
on the zero-padded window.
Here is the Matlab code to implement the signal smoothing:
12
4.9. Actively increasing the SNR:
A good Signal-to-Noise Ratio (SNR) is essential to accurately detect the heart rate
signal. It can be improved by either reducing the noise or increasing the signal power.
In our case, the noise affecting the measurement comes from three sources basically:
Image noise : It originates in the camera sensor. By averaging all the pixels for
the brightness calculation, we are filtering out a big part of its spatial component. The
band-pass filter that we are applying to the brightness signal is also rejecting much of
its time component. However, there will always be some level of this noise in the band
of interest.
User behavior: The pressure variations of the fingertip against the camera lens
may show up in the signal. If the user has shaky hands, it is better that she uses a finger
of the hand holding the phone.
Lighting changes: Since the finger is lit from behind, any change in the light
sources or in the scene reflecting that light towards the camera may introduce noise in
our pass band. For instance, moving the phone in the air while recording the video
sequence may introduce artifacts if the scene parts in the field of view have very
different light intensities.
Whereas the noise sources are obviously difficult to control —especially if the system
is operated by someone else—, we can increase the signal level by illuminating the
finger with a stable light source.
13
5. Coding in MATLAB:
5.1. Acquire:
function y = acquire(video_file)
if ischar(video_file), display(['Loading file ' video_file]); v = VideoReader(video_file);Else v = video_file;EndnumFrames = v.NumberOfFrames;display(['Total frames: ' num2str(numFrames)]);y = zeros(1, numFrames);for i=1:numFrames, display(['Processing ' num2str(i) '/' num2str(numFrames)]); frame = read(v, i); redPlane = frame(:, :, 1); y(i) = sum(sum(redPlane)) / (size(frame, 1) * size(frame, 2)); Enddisplay('Signal acquired.');display(' ');display(['Sampling rate is ' num2str(v.FrameRate) '. You can now run process(your_signal_variable, ' num2str(v.FrameRate) ')']);End
5.2. Process:
% Parameters to play with
WINDOW_SECONDS = 6; % [s] Sliding window length
BPM_SAMPLING_PERIOD = 0.5; % [s] Time between heart rate
estimations
BPM_L = 40; BPM_H = 230; % [bpm] Valid heart rate range
FILTER_STABILIZATION_TIME = 1; % [s] Filter startup transient
CUT_START_SECONDS = 0; % [s] Initial signal period to cut
off
FINE_TUNING_FREQ_INCREMENT = 1; % [bpm] Separation between test tones
for smoothing
ANIMATION_SPEED_FACTOR = 2; % [] This makes the animation run
faster or slower than real time
14
% Build and apply input filter
[b, a] = butter(2, [(((BPM_L)/60)/fps*2) (((BPM_H)/60)/fps*2)]);
yf = filter(b, a, y);
y = yf((fps * max(FILTER_STABILIZATION_TIME, CUT_START_SECONDS))
+1:size(yf, 2));
% Some initializations and precalculations
num_window_samples = round(WINDOW_SECONDS * fps);
bpm_sampling_period_samples = round(BPM_SAMPLING_PERIOD * fps);
num_bpm_samples = floor((size(y, 2) - num_window_samples) /
bpm_sampling_period_samples);
fcl = BPM_L / 60; fch = BPM_H / 60;
orig_y = y;
bpm = [];
bpm_smooth = [];
max_freq_plot_amplitude = 0;
max_time_plot_bpm = 100;
min_time_plot_bpm = 50;
for i=1:num_bpm_samples,
% Fill sliding window with original signal
window_start = (i-1)*bpm_sampling_period_samples+1;
ynw = orig_y(window_start:window_start+num_window_samples);
% Use Hanning window to bring edges to zero. In this way, no
artificial
% high frequencies appear when the signal is treated as periodic
by the
% FFT
y = ynw .* hann(size(ynw, 2))';
gain = abs(fft(y));
% FFT indices of frequencies where the human heartbeat is
il = floor(fcl * (size(y, 2) / fps))+1; ih = ceil(fch * (size(y,
2) / fps))+1;
index_range = il:ih;
% Plot the amplitude of the frequencies of interest
15
figure(1);
subplot(2, 1, 1);
hold off;
fft_plot = plot((index_range-1) * (fps / size(y, 2)) * 60,
gain(index_range), 'b', 'LineWidth', 2);
hold on;
max_freq_plot_amplitude = max(max_freq_plot_amplitude,
max(gain(index_range)));
axis([BPM_L BPM_H 0 max_freq_plot_amplitude]);
grid on;
xlabel('Heart rate (BPM)');
ylabel('Amplitude');
title('Frequency analysis and time evolution of heart rate
signal');
% Find peaks in the interest frequency range and locate the
highest
[pks, locs] = findpeaks(gain(index_range));
[max_peak_v, max_peak_i] = max(pks);
max_f_index = index_range(locs(max_peak_i));
bpm(i) = (max_f_index-1) * (fps / size(y, 2)) * 60;
% Smooth the highest peak frequency by finding the frequency that
% best "correlates" in the resolution range around the peak
freq_resolution = 1 / WINDOW_SECONDS;
lowf = bpm(i) / 60 - 0.5 * freq_resolution;
freq_inc = FINE_TUNING_FREQ_INCREMENT / 60;
test_freqs = round(freq_resolution / freq_inc);
power = zeros(1, test_freqs);
freqs = (0:test_freqs-1) * freq_inc + lowf;
for h = 1:test_freqs,
re = 0; im = 0;
for j = 0:(size(y, 2) - 1),
phi = 2 * pi * freqs(h) * (j / fps);
re = re + y(j+1) * cos(phi);
16
im = im + y(j+1) * sin(phi);
End
power(h) = re * re + im * im;
End
[max_peak_v, max_peak_i] = max(power);
bpm_smooth(i) = 60*freqs(max_peak_i);
% Plot amplitudes in the fine-tuning interval
hold on;
smoothing_plot = plot(freqs*60, sqrt(power), 'r', 'LineWidth',
2);
% Plot legend
set(fft_plot, 'Displayname', 'FFT modulus');
set(smoothing_plot, 'Displayname', 'Peak smoothing');
legend('Location', 'NorthEast');
% Plot BPM over time
subplot(2, 1, 2);
t = (0:i-1) * ((size(orig_y, 2) / fps) / (num_bpm_samples - 1));
hold off;
plot(t, bpm_smooth, 'r', 'LineWidth', 2);
max_time_plot_bpm = max(max_time_plot_bpm, max(bpm_smooth));
min_time_plot_bpm = min(min_time_plot_bpm, min(bpm_smooth));
axis([0 ((size(orig_y, 2)-1) / fps) min_time_plot_bpm
max_time_plot_bpm]);
grid on;
xlabel('Time (s)');
ylabel('Heart rate (BPM)');
% Print speed factor over real time
box = uicontrol('style', 'text');
set(box, 'String', [num2str(ANIMATION_SPEED_FACTOR) 'x']);
set(box, 'Position', [512-42, 7, 40, 38]);
set(box, 'FontName', 'Ubuntu Condensed');
set(box, 'ForegroundColor', hex2dec({'88' '88' '88'})/255);
set(box, 'FontSize', 22);
set(box, 'BackgroundColor', hex2dec({'cc' 'cc' 'cc'})/255);
17
drawnow(); % Flush graphics
pause(BPM_SAMPLING_PERIOD / ANIMATION_SPEED_FACTOR);
End
disp(['Mean HR: ' num2str(mean(bpm_smooth)) ' bpm']);
End
List of Figures and tables:
Figure No.
Title of Figure Page numbers.
4.3.1 Filter Frequency response 64.3.2 Original signal vs filtered signal 64.6.1 Hann Window for 6 second sliding window 84.8.1 Smoothing FFT curve 11
18
References:
Digital Image Processing Using MATLAB by Gonzalez, Woods, &
Eddins.
DIGITAL SIGNAL PROCESSING (IND ADP): Principles, Algorithms and Applications Paperback by John G. Proakis , Dimitris G. Manolakis
en.wikipedia.org/wiki/Digital_image_processing
19
20