
    CMPE264: Image Analysis and Computer Vision

    Final Project Report

    UAV Video Stabilization

    Mariano I. Lizarraga ([email protected])

    Sonia Arteaga ([email protected])

    August 15, 2007


    1 Introduction

Unmanned Aerial Vehicles (UAVs) have slowly started to permeate into civilian and law enforcement applications. Once thought to be exclusively employed by the military, UAVs are being used today in border surveillance, whale and other sea mammal tracking, and search and rescue missions in disaster areas. The usefulness of the imagery that these flying robots relay to the controlling ground stations in military applications is directly related to how much information the operator can extract from the frames being watched in real time. Image quality, RF noise rejection, and image stabilization have come to play an important role in the overall performance measurement of the mission. While many of the mainstream UAVs have sophisticated equipment onboard to take care of the above mentioned problems, the price tag is usually in the order of millions of dollars, making them unaffordable in almost any situation other than military applications.

Recent advances in solid-state sensors and the overall reduction in size of the electronics have made it possible to noticeably improve the capabilities of smaller UAVs. Nevertheless, these smaller UAVs are more sensitive to the natural oscillations of the airplane and to wind turbulence, thus degrading the stability of the imagery served to the end user.

The Naval Postgraduate School (NPS) has been performing experimental flights on tactical UAVs since 2001 in order to develop technology that supports U.S. troops in different peace-keeping scenarios around the world. These small UAVs carry visual and IR cameras used to relay video down to a ground station, providing vital information for the deployed team. Even though these UAVs have autopilots and robust control schemes implemented on board, it is practically impossible to completely eliminate vibration and oscillations due to external disturbances and the natural behavior of the plane. These oscillations are mechanically transmitted to the camera, and as a consequence the relayed video is difficult to watch and exhausting for the operator to evaluate.

To address the issue of oscillations and low-frequency vibrations in the recorded imagery, the implementation of an image stabilization algorithm is required to improve visual quality. Furthermore, the stabilization algorithm needs to be robust and computationally inexpensive enough to perform in real time, and to run on the PC104 computer that is available at the ground station.


    2 Implementation

Image stabilization for moving platforms is usually related to the compensation of high-frequency unwanted motion. Many of the widely known algorithms, like the ones presented in [1], are very sensitive to panning and rotation, thus rendering them useless for applications where panning and rotation are intentional.

The image stabilization algorithm, and the Simulink implementation presented herein, follow directly the work presented in [2], which shows very stable behavior under intentional panning and rotation. This algorithm offered promising results in stabilizing the UAV footage provided by the Unmanned Systems Lab at the Naval Postgraduate School. The implemented frame motion compensation follows the one proposed in [3].

Simulink, a model-based engineering tool developed by The MathWorks (makers of Matlab), was chosen as the development platform for this project due to its block-oriented design paradigm, which offers great ease of use and a better understanding of each functional block of the algorithm.

The presented algorithm consists of five main functional blocks, shown in Figure 1 and sketched in code after the list below:

    Figure 1: Top Level Simulink Diagram

    Video reading and grayscale conversion,

    Gray-Code calculation,

    Sub-frame correlation measure calculation,

    Global motion calculation, and

    Motion compensation.
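Since a Simulink diagram does not translate directly to text, the following self-contained Python/NumPy sketch (our own illustration; the function names, the choice of bit plane k = 4, and the damping constant are assumptions, not values taken from the original model) shows how the five blocks chain together once the frames have been read and converted to grayscale. Each stage is described in detail in the subsections that follow.

```python
import numpy as np

def gray_coded_plane(frame, k=4):
    """Gray-coded bit plane g_k = a_k XOR a_{k+1} of an 8-bit frame (eq. 2)."""
    return ((frame >> k) & 1) ^ ((frame >> (k + 1)) & 1)

def local_motion(g_t, g_prev, top, left, size, p):
    """Offset (m, n) in [-p, p]^2 minimizing the correlation measure (eq. 3)."""
    region = g_t[top:top + size, left:left + size]
    best_c, best_v = None, (0, 0)
    for m in range(-p, p + 1):
        for n in range(-p, p + 1):
            ref = g_prev[top + m:top + m + size, left + n:left + n + size]
            c = np.mean(region ^ ref)        # fraction of non-correspondences
            if best_c is None or c < best_c:
                best_c, best_v = c, (m, n)
    return best_v

def stabilize(frames, size=56, p=8, damping=0.95):
    """Yield motion-compensated frames from an iterable of grayscale frames."""
    prev_g, v_acc, v_g_prev = None, np.zeros(2), np.zeros(2)
    for f in frames:
        g = gray_coded_plane(f)
        if prev_g is not None:
            h, w = f.shape
            # four sub-frame regions kept p pixels clear of the borders
            corners = [(p, p), (p, w - size - p),
                       (h - size - p, p), (h - size - p, w - size - p)]
            vs = [local_motion(g, prev_g, r, c, size, p) for r, c in corners]
            v_g = np.median(np.vstack(vs + [v_g_prev]), axis=0)   # eq. (5)
            v_acc = damping * v_acc + v_g                         # eq. (6)
            dy, dx = np.round(v_acc).astype(int)
            f = np.roll(f, (-dy, -dx), axis=(0, 1))   # relocate the frame
            v_g_prev = v_g
        prev_g = g
        yield f
```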


    2.1 Video Reading and Grayscale Conversion

The first step is to read the video frame by frame and convert each frame into a grayscale 8-bit image, which will be the actual input to the algorithm. This is performed in Simulink with the block layout shown in Figure 2. Note that before the output the video stream is down-sampled by two, thus completely ignoring every other frame. This was done to improve the throughput and increase the frame rate of the output.

    Figure 2: Grayscale Frame Conversion and Downsampling
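For readers without Simulink, a rough Python equivalent of this block, assuming OpenCV (cv2) is available for video decoding, might look like the following sketch:

```python
import cv2

def read_grayscale_downsampled(path):
    """Yield every other frame of the video as an 8-bit grayscale image."""
    cap = cv2.VideoCapture(path)
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % 2 == 0:               # down-sample by two
            yield cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        index += 1
    cap.release()
```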

    2.2 Gray-Code Calculation

This functional block decomposes the frame into 8 binary images a_k, called bit plane images, such that the frame f at time t is given by [2]:

f_t(x, y) = a_{K-1}2^{K-1} + a_{K-2}2^{K-2} + \cdots + a_1 2^1 + a_0 2^0.  (1)

Figure 3 shows the 8 bit plane decomposition of a given frame. The next part of this functional block calculates the Gray-Code of two successive bit plane images. The Gray-Code, named after Bell researcher Frank Gray, is a binary numeral system where two successive numbers only differ in one digit [4]. The Gray-Coded Bit Plane image is given by:

g_k = a_k \oplus a_{k+1}, \quad 0 \le k \le 6.  (2)

It is this gray-coded image g_k that is passed on to the next functional block.


    Figure 3: 8 Bit Plane Frame Decomposition
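As a small NumPy illustration of Equations 1 and 2 (our own code, not the Simulink block), the bit planes can be peeled off with shifts and recombined to verify the decomposition:

```python
import numpy as np

def bit_planes(frame):
    """Bit plane images a_0 .. a_7 of an 8-bit frame (eq. 1)."""
    return [(frame >> k) & 1 for k in range(8)]

def gray_coded_planes(frame):
    """Gray-coded bit planes g_k = a_k XOR a_{k+1}, 0 <= k <= 6 (eq. 2)."""
    a = bit_planes(frame)
    return [a[k] ^ a[k + 1] for k in range(7)]

# Round-trip check of eq. 1: the bit planes reassemble the original frame.
frame = np.random.randint(0, 256, (4, 4), dtype=np.uint8)
planes = bit_planes(frame)
assert (sum(p.astype(int) << k for k, p in enumerate(planes)) == frame).all()
```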

    2.3 Sub-frame Correlation Measure

This functional block divides the gray-coded image g_k into four regions of size M × N and defines a search window of size (M + 2p) × (N + 2p), which is explored in turn to calculate the following correlation measure [2]:

C_j(m, n) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} g_k^t(x, y) \oplus g_k^{t-1}(x + m, y + n).  (3)

Therefore C_j acts as an accumulator that counts the number of non-correspondences between g_k^t and g_k^{t-1}; thus the smaller the value, the better the match. Figure 4 shows the Simulink implementation of this functional block.


    Figure 4: Correlation Measure Calculation
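A toy NumPy check of Equation 3 (illustrative only; for brevity the search offsets run over [0, 2p] instead of [-p, p]): a region cut from a shifted binary image should yield the minimum measure at the true offset.

```python
import numpy as np

def correlation_measure(region, g_prev, m, n):
    """C_j(m, n): fraction of non-correspondences between the current region
    and the previous Gray-coded plane offset by (m, n) (eq. 3)."""
    M, N = region.shape
    return np.mean(region ^ g_prev[m:m + M, n:n + N])

rng = np.random.default_rng(0)
g_prev = rng.integers(0, 2, (80, 80))
region = g_prev[3:3 + 56, 5:5 + 56]   # current region: g_prev shifted by (3, 5)
scores = {(m, n): correlation_measure(region, g_prev, m, n)
          for m in range(17) for n in range(17)}
assert min(scores, key=scores.get) == (3, 5)
```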

    2.4 Global Motion Calculation

This functional block chooses the minimum correlation measure C_j of each region (thus the best match); the coordinates of that minimum within the search matrix correspond to the local motion vector V_j, obtained as:

V_j = \arg\min_{(m, n)} C_j(m, n).  (4)

These motion vectors V_j are stacked together with the global motion vector V_g^{t-1} from the previous frame and passed through a median filter to obtain the current global motion vector V_g^t:

V_g^t = \mathrm{median}\{V_1^t, V_2^t, V_3^t, V_4^t, V_g^{t-1}\}.  (5)

Figure 5 shows the Simulink implementation of this functional block.

Figure 5: Global Motion Calculation
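Equations 4 and 5 amount to a component-wise median over the four local vectors and the previous global vector; a minimal sketch (our own, with made-up example numbers):

```python
import numpy as np

def global_motion(local_vectors, v_g_prev):
    """Median filter of the four local motion vectors and the previous
    global motion vector, component-wise (eq. 5)."""
    return np.median(np.vstack(local_vectors + [v_g_prev]), axis=0)

# Example: one outlying region (e.g. covering a moving object) is rejected.
print(global_motion([(2, 1), (2, 2), (9, -7), (2, 1)], (1, 1)))  # -> [2. 1.]
```

The median makes the global estimate robust to one or two regions that track local scene motion instead of camera motion.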

    2.5 Motion Compensation

Since motion may originate from intentional panning, the global motion vector V_g needs to be damped to allow for smooth panning, given by:

V_a^t = D_1 V_a^{t-1} + V_g^t.  (6)

With this motion vector, the original frame image is relocated to remove the unwanted motion while still keeping intentional panning. Figure 6 shows the Simulink implementation of this functional block. Note that before the output a frame-rate transition block is included to keep the frame rate constant, taking into account the down-sampling mentioned in Subsection 2.1.

    Figure 6: Motion Compensation
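A minimal sketch of Equation 6 and the frame relocation (the damping value 0.95 is our assumption, since the report does not state D_1; np.roll wraps border pixels where the Simulink model crops or pads instead):

```python
import numpy as np

def compensate(frame, v_g, v_acc, damping=0.95):
    """Damped accumulation of the global motion (eq. 6), then shift the frame
    by the accumulated vector to cancel the unwanted motion."""
    v_acc = damping * v_acc + np.asarray(v_g, dtype=float)
    dy, dx = np.round(v_acc).astype(int)
    return np.roll(frame, (-dy, -dx), axis=(0, 1)), v_acc
```

The damping prevents the accumulated correction from growing without bound, so intentional panning eventually shows through smoothly while transient jitter is cancelled.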

    3 Results

Several tests were run using different region sizes in order to quantify the variability in the values of the motion vectors, since these values affect the visual quality of the motion-compensated video footage.

The block sizes ranged from a minimum of 6 × 6 pixels, in increments of 25 pixels, up to a maximum of 106 × 106 pixels. A fixed value of p = 8 in Equation 3 was used. The motion vectors of the first 16 frames were then plotted in a bar graph to show the frame-to-frame variability for each block size. Figure 7 shows that the results for block sizes 56 through 106 contain practically identical values, thus allowing us to reduce the size of the region scanned and noticeably improve the output frame rate to 7 frames per second for the analyzed footage.

Figure 7: Motion Vector Components for Different Values of N

Using 56 × 56 regions, there remained the need to verify that the stabilization was working correctly. Therefore a small Simulink model, shown in Figure 8, was set up in order to generate difference frames such that:

F_d = F_t - F_{t-1},  (7)

for both the original footage and the compensated footage. Figure 9 shows three difference frames for a given sequence, showing that the compensation indeed does much better than the original video. Figure 10 shows the mean of each difference frame F_d for a segment of video footage. It is clear from that figure that the compensated video does better in most cases than the original video.


    Figure 8: Simulink Model of the Frame Difference Comparison
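The curves of Figure 10 can be reproduced with a few lines; a sketch assuming both sequences are available as lists of grayscale arrays (we take the mean absolute difference, since the signed mean of Equation 7 would largely cancel):

```python
import numpy as np

def mean_difference_frames(frames):
    """Mean absolute difference frame |F_t - F_{t-1}| per pair (eq. 7);
    lower values indicate a more stable sequence."""
    return [np.mean(np.abs(curr.astype(int) - prev.astype(int)))
            for prev, curr in zip(frames, frames[1:])]

# original_means = mean_difference_frames(original_frames)
# stabilized_means = mean_difference_frames(stabilized_frames)
```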

    4 Conclusion

From the implementation of the previously described image stabilization algorithm one can conclude the following:

The GCBP method described in [2] and [3] shows good performance in stabilizing video footage that contains significant intentional rotation and panning, as was the case with the UAV footage.

The significance of the results for the motion vectors mentioned in Section 3 is that decreasing the block size also increases the speed of the motion compensation implementation. Therefore, the results above show that we can run the model with a block size of roughly 56 × 56 pixels and still attain the same level of quality as running the model with a larger block size, but at a faster speed and with reduced computational cost.

The use of Simulink to implement the algorithm offered great insight into each step of the algorithm and allowed us to test and debug each functional block independently.

    5 Acknowledgments

The authors of this Final Project would like to thank Dr. Vladimir Dobrokhodov from the Naval Postgraduate School Unmanned Systems Lab for providing us with several hours of UAV footage and invaluable support in the Simulink implementation of this algorithm.

Figure 9: Difference Frames at Different Time Intervals

References

[1] J. Bergen, P. Anandan, K. Hanna, and R. Hingorani, "Hierarchical Model-Based Motion Estimation," David Sarnoff Research Center, Princeton, NJ, 1992.

[2] S. Ko, S. Lee, S. Jeon, and E. Kang, "Fast Digital Image Stabilizer Based on Gray-Coded Bit Plane Matching," IEEE Transactions on Consumer Electronics, Vol. 45, No. 3, August 1999.

[3] A. Brooks, "Real-Time Digital Image Stabilization," Image Processing Report, Department of Electrical Engineering, Northwestern University, Evanston, IL, 2003.

[4] "Gray Code," Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Gray_code, November 2006.


    Figure 10: Mean of Difference Frames for a Segment of Video Footage
