
Low Light Video Processing on Mobile Devices
Patrick Martinchek¹, Nobie Redmon², Imran Thobani³

¹Graduate School of Business, Stanford University
²Department of Applied Physics, Stanford University
³Department of Electrical Engineering, Stanford University

Motivation

Low contrast and noise remain a barrier to visually pleasing video in low-light conditions. Recording appealing video at concerts, social gatherings, and in security-monitoring situations is still an unsolved problem, with many groups searching for a solution. Our target application for this research is a better software solution for mobile low-light video, particularly in concert venues. Although mobile image-processing products such as Instagram create nice-looking photos by applying overlay filters, these simple processing algorithms fail to improve low-light images. Here we explore mobile video enhancement via advanced histogram equalization (HE) and denoising via probabilistic temporal averaging with a hidden Markov model (HMM).

Uncorrected low-light concert video frame
Low-light corrected, noisy concert video frame

Contrast enhancement with denoising

Motion Detector

Traditional Histogram Equalization

Implemented Algorithm
Original Image

•  Parallelize complete algorithm for use on GPU
•  Implement Video Stabilization

Future Work

References

•  Train HMM Transition Matrix
•  Implement Online Learning
•  Model direction of motion
•  Static Denoising

iPhone Implementation
Because histogram equalization is a pixel-wise operation, we can accelerate these calculations on the iPhone 5S's onboard GPU. About 75% of our mobile solution executes on the GPU, which is fast enough for our algorithms to operate at video frame rates on an iPhone 5S device. To speed development, we used an iOS framework called GPUImage. The GPUImage library uses OpenGL ES 2.0 shaders to perform image and video processing significantly faster than CPU-based routines, and it provides a simple, lightweight Objective-C interface to the complex OpenGL ES API. Through this interface we write custom shader programs in the OpenGL Shading Language (GLSL), which control how each pixel value is computed and displayed on the GPU.
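Pixel-wise mappings like these are embarrassingly parallel: each output pixel depends only on the corresponding input pixel, which is exactly what lets a GLSL fragment shader evaluate all pixels concurrently on the GPU. A rough CPU-side sketch of such a mapping (a gamma curve in vectorized NumPy; illustrative only, not the GPUImage API):

```python
import numpy as np

def gamma_map(frame: np.ndarray, gamma: float = 0.45) -> np.ndarray:
    """Apply a per-pixel gamma curve to an RGB frame with values in [0, 255].

    Every output pixel depends only on its own input pixel, so the same
    computation can run for all pixels in parallel in a fragment shader.
    """
    norm = frame.astype(np.float32) / 255.0        # normalize to [0, 1]
    out = np.power(norm, gamma)                    # independent per-pixel op
    return (out * 255.0 + 0.5).astype(np.uint8)    # requantize to 8 bits

# A dark frame brightens: gamma < 1 lifts the low end of the tone curve.
dark = np.full((4, 4, 3), 64, dtype=np.uint8)
bright = gamma_map(dark)
```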


A. Pižurica, V. Zlokolica, and W. Philips. Noise Reduction in Video Sequences Using Wavelet-Domain and Temporal Filtering. Proc. SPIE 5266, Wavelet Applications in Industrial Processing, 48 (Feb 27, 2004). doi:10.1117/12.516069
Image Denoising Algorithms Archive. http://www5.cs.fau.de/research/software/idaa
G. Toderici and J. Yagnik. Automatic, Efficient, Temporally-Coherent Video Enhancement for Large Scale Applications. ACM Multimedia 2009: 609-612
Brad Larson. GPUImage. http://www.github.com/BradLarson/GPUimage

Temporally Coherent Contrast Enhancement

Probabilistic Denoising via HMM

Traditional HE is very effective at increasing contrast in individual images. However, when it is applied to video, several pathological artifacts arise: flickering due to changes in illumination, enhanced noise, and blocking, among others.

The probability density function of an image is given by:

From the PDF we calculate the cumulative distribution function and scale the value component (of HSV).

PDF = P(k) = \Pr(I(x, y) = k)

CDF = C(k) = \sum_{i=0}^{k} P(i), \qquad k = 0, \ldots, N-1

l_{out} = \frac{l_{in} - b_{est}}{w_{est} - b_{est}} (w_d - b_d) + b_d
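A minimal single-frame sketch of the PDF/CDF mapping and the level scaling above (grayscale for simplicity; the actual pipeline operates on the V channel of HSV, and the levels b_est, w_est would come from an estimator rather than being passed in by hand):

```python
import numpy as np

def equalize(values: np.ndarray, n_levels: int = 256) -> np.ndarray:
    """Histogram-equalize one integer-valued channel via its PDF and CDF."""
    counts, _ = np.histogram(values, bins=n_levels, range=(0, n_levels))
    pdf = counts / values.size                  # P(k) = Pr(I(x, y) = k)
    cdf = np.cumsum(pdf)                        # C(k) = sum_{i=0}^{k} P(i)
    return (cdf[values] * (n_levels - 1)).astype(np.uint8)

def level_scale(l_in, b_est, w_est, b_d=0.0, w_d=255.0):
    """Map estimated black/white levels (b_est, w_est) onto desired (b_d, w_d)."""
    return (l_in - b_est) / (w_est - b_est) * (w_d - b_d) + b_d
```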

Scaling l_out independently at each frame leads to unwanted flickering, so the estimators (w_est, b_est) are temporally averaged.
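The temporal averaging of the estimators can be sketched as a simple exponential moving average per frame (the weight alpha = 0.9 is our illustrative choice, not a value from the poster):

```python
def smooth_estimators(prev, current, alpha=0.9):
    """Blend the new per-frame (w_est, b_est) with their running averages.

    Weighting the previous value heavily suppresses the frame-to-frame
    jumps in the scaling that would otherwise appear as flicker.
    """
    w_prev, b_prev = prev
    w_cur, b_cur = current
    return (alpha * w_prev + (1 - alpha) * w_cur,
            alpha * b_prev + (1 - alpha) * b_cur)
```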

For the case of high background/foreground contrast we scale l_out:

l'_{out} = \begin{cases} l_{in} - \delta_l^{\gamma_l} & \text{for } \delta_l < 0 \\ l_{out} & \text{otherwise} \end{cases}
\qquad \text{where } \delta_l = l_{out} - l_{in}

Conventional denoising algorithms temporally average individual pixels. However, for videos with motion it is important to exclude moving pixels from the average to avoid motion artifacts. Here we implement a hidden Markov model (HMM) to assign a motion probability to each pixel and weight the average accordingly.

A Hidden Markov Model: x = hidden variable, z = observed variable

Each pixel has its own HMM, with each time k corresponding to a video frame. In the figure above, the observed variable z_k is a 3-vector representing the absolute difference between the pixel's RGB values at frame k and the running average of the pixel's RGB values up to frame k-1. The hidden variable x_k is chosen to be a Bernoulli variable representing whether the difference z_k is caused by significant motion occurring at the pixel, as opposed to random noise.

After hand-tuning the 2×2 transition matrix A and the initial probability P(x_1 = 1) to reasonable values, we can efficiently perform HMM filtering on any video to compute m_k = P(x_k = 1 | z_{1:k}) for each pixel and each frame k; m_k represents the model's belief that motion is occurring at the pixel at time k.

We then set the pixel's RGB values at frame k, a 3-vector q_k, to be a weighted average of its current RGB values p_k and the running average a_k of its RGB values, as follows. Incorporating m_k into the calculation of the running average a_k effectively resets the running average to the current RGB values whenever the model believes that motion occurred at the pixel.
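The per-pixel filtering step is the standard HMM forward recursion. A sketch for one pixel, where the transition matrix values and the emission model (a Gaussian noise likelihood under x = 0, a flat likelihood under motion) are our assumptions for illustration only; the poster hand-tunes A and P(x_1 = 1) but does not specify an emission model:

```python
import numpy as np

# Hand-tuned transition matrix A[i, j] = P(x_k = j | x_{k-1} = i)
A = np.array([[0.95, 0.05],    # static pixels tend to stay static
              [0.30, 0.70]])   # motion tends to persist for a few frames

def emission_likelihood(z, sigma_noise=8.0):
    """P(z_k | x_k) for both states; z is the 3-vector |RGB - running avg|.

    x = 0 (noise): differences concentrated near zero (Gaussian per channel).
    x = 1 (motion): differences spread over the whole range (uniform).
    """
    p_noise = np.prod(np.exp(-0.5 * (z / sigma_noise) ** 2)
                      / (sigma_noise * np.sqrt(2 * np.pi)))
    p_motion = (1.0 / 256.0) ** 3
    return np.array([p_noise, p_motion])

def forward_filter(z_seq, p1_motion=0.1):
    """Return m_k = P(x_k = 1 | z_{1:k}) for one pixel's difference sequence."""
    belief = np.array([1 - p1_motion, p1_motion])
    out = []
    for z in z_seq:
        predicted = belief @ A                        # time update
        posterior = predicted * emission_likelihood(z)
        belief = posterior / posterior.sum()          # measurement update
        out.append(belief[1])
    return out
```

Small differences keep m_k near zero; a large jump in z_k drives it toward one.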

q_k = m_k p_k + (1 - m_k) a_{k-1}

a_k = m_k p_k + (1 - m_k) (\beta p_k + (1 - \beta) a_{k-1})
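Given m_k, the update is only a couple of lines per pixel (a single scalar channel is shown; beta = 0.1 is our illustrative choice):

```python
def denoise_pixel(p_k, a_prev, m_k, beta=0.1):
    """Motion-gated temporal average for one pixel channel.

    m_k -> 1: output and running average snap to the current pixel value,
    avoiding ghosting on motion. m_k -> 0: output follows the running
    average, suppressing noise in static regions.
    """
    q_k = m_k * p_k + (1 - m_k) * a_prev
    a_k = m_k * p_k + (1 - m_k) * (beta * p_k + (1 - beta) * a_prev)
    return q_k, a_k
```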

Benchmarks
Applying a gamma filter to a single video frame:

Method                              Time per frame
CPU Processing (CPU only)           24.80 ms
Core Image (GPU+CPU, native iOS)    3.66 ms
GPUImage (pure GPU computation)     1.30 ms
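The timings above were measured on-device. The general shape of such a micro-benchmark, sketched in Python purely to show the methodology (a warm-up run followed by averaging over repeats), not to reproduce the iPhone numbers:

```python
import time
import numpy as np

def time_filter(filter_fn, frame, n_runs=20):
    """Return the mean milliseconds per call, after one warm-up run."""
    filter_fn(frame)                      # warm-up (caches, allocations)
    start = time.perf_counter()
    for _ in range(n_runs):
        filter_fn(frame)
    return (time.perf_counter() - start) / n_runs * 1e3

# Time a simple gamma filter on one 720p RGB frame.
frame = np.random.randint(0, 256, size=(720, 1280, 3), dtype=np.uint8)
gamma = lambda f: np.power(f / 255.0, 0.45)
ms = time_filter(gamma, frame)
```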