Signal Processing and Analysis
Multidimensional processing – Images and video
Benny Thörnberg
Associate professor
in electronics
Copyright (c) Benny Thörnberg 1:29
Outline
• Multidimensional signals
• Convolution in 2D or more…
• Separable convolution
• Video filters with temporal behaviour
• Signal processing for camera surveillance
• Multidimensional processing in frequency domain
Images and video
[Figure: video frames with axes n, m and f.]
The intensity $I$ of a single pixel is a function of three dimensions, $I = I(f, m, n)$.
$f$ is the frame index of sampling in the time domain. $m$ and $n$ are indexes of sampling in the spatial image domain.
The bit-width of a typical data bus is 32 or 64 bits, which allows access to or transfer of one or a few pixels per clock cycle. Access and transfer of images or video will thus happen serially.
Images and video
[Figure: video frames with axes n, m and f.]
f = 1;
forever do {
    for m = 1 to rMax {
        for n = 1 to cMax {
            VideoData = I(f, m, n);
        }
    }
    f = f + 1;
}
[Figure: timing diagram of the Pixel clock, FrameValid, RowValid and VideoData signals, with VideoData carrying pixels 1 to 10 in sequence.]
Explicit synchronization signals
One or more pixels per
clock cycle on a parallel data bus
Transfer of video frames in progressive scan order
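The transfer order and the synchronization signals can be imitated in software. The following is a minimal sketch, not a model of any particular camera interface: the function name `progressive_scan` is hypothetical, and the single idle cycle after each row and each frame is an assumption made only for illustration.

```python
import numpy as np

def progressive_scan(frames):
    """Yield (frame_valid, row_valid, pixel) tuples in progressive scan order."""
    for frame in frames:                 # f: frame index
        for row in frame:                # m: row index
            for pixel in row:            # n: column index
                yield True, True, pixel
            yield True, False, None      # assumed idle cycle between rows
        yield False, False, None         # assumed idle cycle between frames

# Two frames of 2 rows by 3 columns, numbered in transfer order.
video = np.arange(2 * 2 * 3).reshape(2, 2, 3)
stream = [p for fv, rv, p in progressive_scan(video) if fv and rv]
```

Keeping only the cycles where both FrameValid and RowValid are high recovers the pixels in the same order as the nested loops above.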
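Convolution in 2D or more …
1D:
$$ y(n) = x(n) * h(n) = \sum_{k=-\infty}^{\infty} x(k) \cdot h(n-k) = \sum_{k=-\infty}^{\infty} h(k) \cdot x(n-k) $$
2D:
$$ y(m,n) = x(m,n) * h(m,n) = \sum_{s=-\infty}^{\infty} \sum_{t=-\infty}^{\infty} x(s,t) \cdot h(m-s, n-t) $$
3D:
$$ y(f,m,n) = x(f,m,n) * h(f,m,n) = \sum_{q=-\infty}^{\infty} \sum_{s=-\infty}^{\infty} \sum_{t=-\infty}^{\infty} x(q,s,t) \cdot h(f-q, m-s, n-t) $$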
Geometrical interpretation of convolution in 2D
2D:
$$ y(m,n) = x(m,n) * h(m,n) = \sum_{s=-\infty}^{\infty} \sum_{t=-\infty}^{\infty} x(s,t) \cdot h(m-s, n-t) $$
The 2D filter mask holds the coefficients of a 2D FIR filter whose output is computed in a predefined sequential order.
[Figure: 3-by-3 filter mask with coefficients $h_{i,j}$, $i, j \in \{-1, 0, 1\}$.]
A corresponding neighborhood of input data is accessed at every position where output data is computed at its center. This neighborhood is often referred to as a “sliding window”.
[Figure: 3-by-3 sliding window of input samples $x_{i,j}$, $i, j \in \{-1, 0, 1\}$, placed on an image with axes m and n.]
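The sliding-window view of 2D convolution can be written out directly. This is a minimal sketch assuming an odd-sized mask and zero padding at the image border; the function name `conv2d_same` and the border handling are illustrative choices, not something specified on the slide.

```python
import numpy as np

def conv2d_same(x, h):
    """Direct 2D convolution with zero padding; output has the shape of x."""
    S, T = h.shape                          # mask size, assumed odd in both dimensions
    ps, pt = S // 2, T // 2
    xp = np.pad(x, ((ps, ps), (pt, pt)))    # zero padding around the image
    hf = h[::-1, ::-1]                      # flip the mask: convolution, not correlation
    y = np.zeros(x.shape, dtype=float)
    for m in range(x.shape[0]):
        for n in range(x.shape[1]):
            window = xp[m:m + S, n:n + T]   # sliding window centred at (m, n)
            y[m, n] = np.sum(window * hf)
    return y

# An impulse in the image reproduces the filter mask around its position.
x = np.zeros((5, 5))
x[2, 2] = 1.0
h = np.arange(9, dtype=float).reshape(3, 3)
y = conv2d_same(x, h)
```

The two nested loops visit output positions in progressive scan order, which matches the order in which a streaming implementation would produce them.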
Images and video
[Figure: flow graph. The input video stream passes through six unit-delay registers and two row-length FIFOs, whose nine outputs form the 3-by-3 sliding window.]
The flow graph above shows how FIFO registers are used to store data from a progressive scan pixel stream. Each FIFO has a length equal to the number of pixels in one row. Data outputs at the bottom of the graph allow for simultaneous access to all pixels within the sliding window.
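The behaviour of such a line-buffer structure can be imitated with a single shift register that is two rows plus three pixels long. The sketch below is a software analogy under that assumption, not a description of the hardware in the flow graph; `stream_windows` is a hypothetical name, and only window positions that lie fully inside the image are emitted.

```python
from collections import deque
import numpy as np

def stream_windows(pixels, width):
    """Yield (row, col, 3x3 window) for a progressive scan pixel stream."""
    taps = deque(maxlen=2 * width + 3)      # two row FIFOs plus three registers
    for i, p in enumerate(pixels):
        taps.append(p)                      # one new pixel per clock cycle
        r, c = divmod(i, width)             # position of the newest pixel
        if r >= 2 and c >= 2:               # window lies fully inside the image
            buf = list(taps)
            win = [buf[k * width:k * width + 3] for k in range(3)]
            yield r - 1, c - 1, np.array(win)

img = np.arange(20).reshape(4, 5)
windows = list(stream_windows(img.ravel(), width=5))
```

Each yielded window is centred one row and one column behind the newest pixel, which is the latency that the buffering introduces.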
Separable filter kernels
2D:
$$ y(m,n) = x(m,n) * h(m,n) = \sum_{s=-\infty}^{\infty} \sum_{t=-\infty}^{\infty} h(s,t) \cdot x(m-s, n-t) $$
The 2D impulse response of the filter $h(s,t)$ is separable if it can be expressed as the outer product of two vectors, $h = h_r \otimes h_c = h_r \cdot h_c^T$.
$h_r$ and $h_c$ are S×1 and T×1 column vectors.
$$ y(m,n) = x(m,n) * h(m,n) = \sum_{s=-\infty}^{\infty} \sum_{t=-\infty}^{\infty} h_r(s) \cdot h_c(t) \cdot x(m-s, n-t) $$
$$ = \sum_{s=-\infty}^{\infty} h_r(s) \sum_{t=-\infty}^{\infty} h_c(t) \cdot x(m-s, n-t) = h_r(m) * h_c(n) * x(m,n) $$
This means that the separable 2D convolution with $h$ can be divided into two separate 1D convolutions with $h_r$ and $h_c$.
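The equivalence between one 2D convolution and two 1D convolutions can be checked numerically. The following sketch uses hypothetical helper names (`conv_rows_then_cols`, `conv2d_full`) and random test data; it computes the full convolution so that no border handling has to be assumed.

```python
import numpy as np

def conv_rows_then_cols(x, h_r, h_c):
    """Convolve with h_r along axis 0 (rows index), then h_c along axis 1."""
    tmp = np.apply_along_axis(np.convolve, 0, x, h_r, mode="full")
    return np.apply_along_axis(np.convolve, 1, tmp, h_c, mode="full")

def conv2d_full(x, h):
    """Direct full 2D convolution, used here only as a reference."""
    M, N = x.shape
    S, T = h.shape
    y = np.zeros((M + S - 1, N + T - 1))
    for s in range(S):
        for t in range(T):
            y[s:s + M, t:t + N] += h[s, t] * x   # h(s,t) * x(m-s, n-t)
    return y

rng = np.random.default_rng(0)
x = rng.random((6, 7))
h_r = rng.random(3)                  # S-by-1 vector
h_c = rng.random(5)                  # T-by-1 vector
y_sep = conv_rows_then_cols(x, h_r, h_c)
y_2d = conv2d_full(x, np.outer(h_r, h_c))
```

Both results agree to within floating-point rounding, which is what the derivation above predicts.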
Why 2D convolution in two steps?
Convolving an M-by-N image using a filter mask of size S-by-T requires roughly $NMST$ multiply-and-add operations.
The same 2D convolution divided into an S-by-1 convolution followed by a 1-by-T convolution requires $NM(S+T)$ multiply-and-add operations.
The expected speedup of computation on a microprocessor will thus be
$$ \frac{NMST}{NM(S+T)} = \frac{ST}{S+T} $$
For larger filter masks, the expected speedup can be substantial, e.g. an 11-by-11 mask results in a speedup of 121/22 = 5.5.
If parallel computation in hardware is used instead, there will be a 5.5 times reduction of allocated arithmetic resources.
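The operation counts are easy to tabulate for concrete sizes. In this small sketch the function name `mac_counts` and the 480-by-640 image size are assumptions chosen only to give the arithmetic something to work on.

```python
def mac_counts(M, N, S, T):
    """Return (direct, separable) multiply-and-add counts for an M-by-N image."""
    direct = M * N * S * T           # one S-by-T mask per output pixel
    separable = M * N * (S + T)      # an S-by-1 pass followed by a 1-by-T pass
    return direct, separable

direct, separable = mac_counts(480, 640, 11, 11)
speedup = direct / separable         # equals S*T / (S + T), independent of M and N
```

The image size cancels in the ratio, so the speedup depends only on the mask dimensions.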
Example - Gaussian LP filter
A 2D Gaussian-shaped filter mask is often used in image processing for noise suppression.
The impulse response of this filter is given by
$$ G_\sigma(r,c) = \frac{1}{2\pi\sigma^2} e^{-\frac{r^2+c^2}{2\sigma^2}} $$
We can easily divide this impulse response into a product of two separate responses for the r and c dimensions,
$$ G_\sigma(r,c) = h_{r\sigma}(r) \cdot h_{c\sigma}(c) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{r^2}{2\sigma^2}} \cdot \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{c^2}{2\sigma^2}} $$
This also means that the kernel $G_\sigma(r,c)$ is separable in its two dimensions.
Example - Gaussian LP filter
The graph shows the 2D Gaussian impulse response (filter mask) for $\sigma = 0.8$.
$$ G_\sigma(r,c) = \begin{bmatrix}
0.0005 & 0.0050 & 0.0109 & 0.0050 & 0.0005 \\
0.0050 & 0.0521 & 0.1139 & 0.0521 & 0.0050 \\
0.0109 & 0.1139 & 0.2487 & 0.1139 & 0.0109 \\
0.0050 & 0.0521 & 0.1139 & 0.0521 & 0.0050 \\
0.0005 & 0.0050 & 0.0109 & 0.0050 & 0.0005
\end{bmatrix} $$
[Figure: surface plot of the impulse response over r and c from -2 to 2, and the amplitude response function obtained through the Fourier transform ℱ.]
Example - Gaussian LP filter
The graph shows the 1D Gaussian impulse response (filter mask) for $\sigma = 0.8$.
$$ h_{r\sigma}(r) = \begin{bmatrix} 0.0219 & 0.2283 & 0.4987 & 0.2283 & 0.0219 \end{bmatrix}^T $$
Example - Gaussian LP filter
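These graphs show the 2D amplitude responses for the two separate operations.
$$ h_{r\sigma}(r) = \begin{bmatrix} 0.0219 & 0.2283 & 0.4987 & 0.2283 & 0.0219 \end{bmatrix}^T $$
$$ h_{c\sigma}^T(c) = \begin{bmatrix} 0.0219 & 0.2283 & 0.4987 & 0.2283 & 0.0219 \end{bmatrix} $$
[Figure: amplitude responses $\mathcal{F}\{h_{r\sigma}\}$ and $\mathcal{F}\{h_{c\sigma}\}$.]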
Example - Gaussian LP filter
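Thus, we can compute the same 2D filter mask from the two separate vectors.
$$ G_\sigma(r,c) = h_{r\sigma}(r) \cdot h_{c\sigma}^T(c) = \begin{bmatrix} 0.0219 \\ 0.2283 \\ 0.4987 \\ 0.2283 \\ 0.0219 \end{bmatrix} \cdot \begin{bmatrix} 0.0219 & 0.2283 & 0.4987 & 0.2283 & 0.0219 \end{bmatrix} $$
$$ G_\sigma(r,c) = \begin{bmatrix}
0.0005 & 0.0050 & 0.0109 & 0.0050 & 0.0005 \\
0.0050 & 0.0521 & 0.1139 & 0.0521 & 0.0050 \\
0.0109 & 0.1139 & 0.2487 & 0.1139 & 0.0109 \\
0.0050 & 0.0521 & 0.1139 & 0.0521 & 0.0050 \\
0.0005 & 0.0050 & 0.0109 & 0.0050 & 0.0005
\end{bmatrix} $$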
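The mask values can be reproduced from the Gaussian formula. The sketch below assumes the mask is sampled at integer positions r, c = -2 … 2 with σ = 0.8, which is consistent with the printed numbers; the variable names are illustrative.

```python
import numpy as np

sigma = 0.8
r = np.arange(-2, 3)                                   # sample positions -2..2

# 1D Gaussian response, used for both the row and the column direction.
h = np.exp(-r**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# 2D mask as the outer product of the two 1D responses.
G = np.outer(h, h)

# The same mask evaluated directly from the 2D formula.
rr, cc = np.meshgrid(r, r, indexing="ij")
G_direct = np.exp(-(rr**2 + cc**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
```

Rounded to four decimals, `h` and `G` match the vector and matrix shown above, and `G` agrees with `G_direct` to within rounding error.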
Example - Gaussian LP filter
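Similarly, we can multiply the two separate amplitude responses and get the 2D amplitude response for the original separable filter, $\mathcal{F}\{G_\sigma\} = \mathcal{F}\{h_{r\sigma}\} \cdot \mathcal{F}\{h_{c\sigma}\}$.
[Figure: amplitude responses $\mathcal{F}\{h_{r\sigma}\}$ and $\mathcal{F}\{h_{c\sigma}\}$, and their product $\mathcal{F}\{h_{r\sigma}\} \cdot \mathcal{F}\{h_{c\sigma}\}$.]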
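The product rule for the amplitude responses can also be verified numerically. This sketch assumes the Gaussian vector from the example (σ = 0.8, five taps) and a zero-padded DFT length of 64, which is an arbitrary choice made only to obtain a smooth response.

```python
import numpy as np

sigma = 0.8
r = np.arange(-2, 3)
h = np.exp(-r**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

K = 64                                        # zero-padded DFT length (assumed)
A_r = np.abs(np.fft.fft(h, K))                # amplitude response of h_r
A_c = np.abs(np.fft.fft(h, K))                # amplitude response of h_c

# Amplitude response of the 2D mask G = h_r * h_c^T.
A_2d = np.abs(np.fft.fft2(np.outer(h, h), s=(K, K)))
```

The 2D amplitude response equals the outer product of the two 1D amplitude responses, `np.outer(A_r, A_c)`, up to rounding.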
Example - Gaussian LP filter
Applied on an image
Video filters with temporal behaviour
• A video filter is said to be temporal if the sliding window includes data from more than one frame
• A video filter is said to be spatio-temporal if the sliding window includes more than one frame and more than one pixel in each frame
• The sliding window in the figure depicts an example of a spatio-temporal video filter operating on three consecutive frames
[Figure: three stacked 3-by-3 sliding windows, one for each of the frames n, n-1 and n-2 along the time axis (frame index).]
Video filters with temporal behaviour
[Figure: memory architecture in which the input video stream feeds three identical chains of row FIFOs and unit-delay registers, one per frame, linked by frame buffers. Each chain outputs the nine pixels of its 3-by-3 window.]
• This diagram shows how pixels in this sliding window can be accessed in real time using a memory architecture
• Frame buffers, line buffers and registers are used to hold data dependencies
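The role of the frame buffers can be illustrated with a purely temporal filter. The sketch below is an assumption-laden example rather than the filter in the diagram: it keeps the three most recent frames and outputs their per-pixel mean, and the name `temporal_mean` is hypothetical.

```python
from collections import deque
import numpy as np

def temporal_mean(frames, depth=3):
    """Yield the per-pixel mean over the `depth` most recent frames."""
    buffers = deque(maxlen=depth)             # frame buffers for frames n, n-1, n-2
    for frame in frames:
        buffers.append(np.asarray(frame, dtype=float))
        if len(buffers) == depth:             # wait until all buffers are filled
            yield sum(buffers) / depth

frames = [np.full((2, 2), v) for v in (0, 3, 6, 9)]
out = list(temporal_mean(frames))
```

A spatio-temporal filter would combine this frame buffering with the row buffering used for the 3-by-3 spatial window.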
Example: video filter for surveillance
• A typical task for a smart video surveillance camera is to detect motion
• A car park is typically a static scene until someone enters the monitored area
• One typical approach is to compute an estimated background image that adapts slowly to gradual changes in light and shadows, so that sudden motion can be detected against it
A temporal video filter - Surveillance
• A low-pass filter applied in the temporal dimension is used to compute a background image
• Filter coefficients for this IIR filter were computed using Matlab sptool
• Sampling frequency Fs = 24 Hz (frames per second)
• Fpass = 0.05 Hz and Fstop = 0.5 Hz
• Max ripple in pass band = 3 dB
• Min attenuation in stop band = 80 dB
[Figure: camera output and computed background image.]
Background computation
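[Figure: block diagram of a fourth-order IIR filter with input x[n,r,c], output y[n,r,c], feedforward coefficients c0 to c4, feedback coefficients -d1 to -d4, adders and four frame buffers F.]
• n, r, c are indexes for frame, row and column
• Four frame buffers are used to store intermediate values in the feedback loop
• This LP filter is thus used to process every pixel in the temporal dimension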
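A per-pixel temporal IIR filter with four frame buffers can be sketched as follows. This is one possible realization (transposed direct form II), which may differ from the exact structure in the diagram. The class name `TemporalIIR` is hypothetical, and the coefficients in the usage lines are placeholders describing a simple first-order low-pass, not the coefficients designed with sptool.

```python
import numpy as np

class TemporalIIR:
    """Fourth-order IIR filter applied to every pixel along the frame index."""

    def __init__(self, c, d, shape):
        self.c = list(c)                      # feedforward coefficients c0..c4
        self.d = list(d)                      # feedback coefficients d1..d4
        self.state = [np.zeros(shape) for _ in range(4)]   # four frame buffers

    def step(self, x):
        """Process one frame x[n] and return the output frame y[n]."""
        c, d, s = self.c, self.d, self.state
        y = c[0] * x + s[0]
        for k in range(3):                    # update buffers from old contents
            s[k] = c[k + 1] * x - d[k] * y + s[k + 1]
        s[3] = c[4] * x - d[3] * y
        return y

# Placeholder coefficients (assumed): y[n] = 0.1 * x[n] + 0.9 * y[n-1].
bg = TemporalIIR(c=[0.1, 0, 0, 0, 0], d=[-0.9, 0, 0, 0], shape=(2, 2))
frame = np.ones((2, 2))
y1 = bg.step(frame)
y2 = bg.step(frame)
```

With a constant input the output rises slowly towards the input level, which is the behaviour wanted from a background estimate.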
Detect objects in motion
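[Figure: block diagram with the camera, background computation, a subtraction (+/-) and local cross correlation. The numbers 1 to 4 mark points in the diagram and the corresponding example images.]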
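The subtraction stage can be sketched in a few lines. This is a deliberate simplification: it thresholds the absolute difference between the camera frame and the background and leaves out the local cross correlation stage. The function name `motion_mask`, the threshold value and the test data are assumptions.

```python
import numpy as np

def motion_mask(frame, background, threshold):
    """Flag pixels whose absolute difference from the background exceeds threshold."""
    diff = np.asarray(frame, dtype=float) - np.asarray(background, dtype=float)
    return np.abs(diff) > threshold

background = np.full((4, 4), 100.0)           # static scene
frame = background.copy()
frame[1:3, 1:3] = 180.0                       # a bright object enters the scene
mask = motion_mask(frame, background, threshold=30.0)
```

Only the four pixels covered by the object are flagged, while the unchanged background stays below the threshold.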
Video surveillance
The demonstrated signal processing is capable of emphasizing objects in motion.
Processed data is multidimensional:
• Frame
• Row
• Column
• Color channel
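Frequency domain – DFT - IDFT
2-dimensional discrete Fourier transform (DFT):
$$ F[u,v] = \mathcal{F}\{f[x,y]\} \equiv \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f[x,y] \cdot e^{-j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)} $$
1-dimensional discrete Fourier transform (DFT):
$$ F[u] = \mathcal{F}\{f[x]\} \equiv \sum_{x=0}^{N-1} f[x] \cdot e^{-j2\pi\frac{ux}{N}} $$
2-dimensional inverse discrete Fourier transform (IDFT):
$$ f[x,y] = \mathcal{F}^{-1}\{F[u,v]\} \equiv \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F[u,v] \cdot e^{j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)} $$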
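The 2D DFT definition can be evaluated directly and compared with a library FFT. The sketch below uses hypothetical function names (`dft2`, `idft2`) and places the 1/(MN) normalization on the inverse transform, which is also the convention used by `numpy.fft`.

```python
import numpy as np

def dft2(f):
    """2D DFT computed directly from the definition (slow, for illustration)."""
    M, N = f.shape
    x = np.arange(M)
    y = np.arange(N)
    W_M = np.exp(-2j * np.pi * np.outer(x, x) / M)   # e^{-j 2 pi u x / M}
    W_N = np.exp(-2j * np.pi * np.outer(y, y) / N)   # e^{-j 2 pi v y / N}
    return W_M @ f @ W_N                             # sum over x, then over y

def idft2(F):
    """Inverse 2D DFT via conjugation, normalized by 1 / (M * N)."""
    M, N = F.shape
    return np.conj(dft2(np.conj(F))) / (M * N)

rng = np.random.default_rng(1)
f = rng.random((4, 6))
F = dft2(f)
f_back = idft2(F)
```

Writing the transform as two matrix products shows that the 2D DFT is itself separable: a 1D DFT along one dimension followed by a 1D DFT along the other.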
Frequency domain - Examples
[Figure: example images in the spatial domain and their amplitude spectra in the frequency domain, related by the Fourier transform ℱ.]
Frequency domain - Examples
[Figure: example images in the spatial domain and their amplitude spectra in the frequency domain, related by the Fourier transform ℱ.]
Image filtering in Frequency domain
Reference: R.C. Gonzalez and R.E. Woods, Digital Image Processing, Addison-Wesley
Image smoothing in Frequency domain
Reference: R.C. Gonzalez and R.E. Woods, Digital Image Processing, Addison-Wesley
Image smoothing in Frequency domain
Reference: R.C. Gonzalez and R.E. Woods, Digital Image Processing, Addison-Wesley
[Figure: original image and the results of smoothing with a 2nd-order Butterworth low-pass filter (LPF) for radii 5, 15, 30, 80 and 230.]
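Smoothing with a Butterworth low-pass filter in the frequency domain can be sketched as below. The transfer function $H(u,v) = 1 / (1 + (D(u,v)/D_0)^{2n})$ is the standard Butterworth form, with $D$ the distance from the centre of the shifted spectrum. The function names, the test image and the cutoff radius are assumptions for illustration.

```python
import numpy as np

def butterworth_lowpass(shape, d0, order=2):
    """Centred transfer function H(u, v) = 1 / (1 + (D / d0)^(2 * order))."""
    M, N = shape
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D = np.sqrt(u[:, None]**2 + v[None, :]**2)   # distance from spectrum centre
    return 1.0 / (1.0 + (D / d0)**(2 * order))

def smooth(image, d0, order=2):
    """Low-pass filter an image by multiplication in the frequency domain."""
    F = np.fft.fftshift(np.fft.fft2(image))      # move the DC term to the centre
    H = butterworth_lowpass(image.shape, d0, order)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

rng = np.random.default_rng(2)
image = rng.random((32, 32))
smoothed = smooth(image, d0=5)
H = butterworth_lowpass((8, 8), d0=2.0)
```

A smaller cutoff radius removes more high-frequency content and gives a blurrier result, while the mean intensity is preserved because the gain at zero frequency is 1.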