Matlab abstract 2016

Machine Learning-Based Coding Unit Depth Decisions for Flexible Complexity Allocation in High Efficiency Video Coding

In this paper, we propose a machine learning-based fast coding unit (CU) depth decision method for High Efficiency Video Coding (HEVC), which optimizes the complexity allocation at the CU level under given rate-distortion (RD) cost constraints. First, we analyze the quad-tree CU depth decision process in HEVC and model it as a three-level hierarchical binary decision problem. Second, a flexible CU depth decision structure is presented, which allows the performance of each CU depth decision to be smoothly traded off between coding complexity and RD performance. Then, a three-output joint classifier consisting of multiple binary classifiers with different parameters is designed to control the risk of false prediction. Finally, a sophisticated RD-complexity model is derived to determine the optimal parameters for the joint classifier, which is capable of minimizing the complexity at each CU depth under given RD degradation constraints. Comparative experiments over various sequences show that the proposed CU depth decision algorithm reduces the computational complexity by 28.82% to 70.93%, and by 51.45% on average, compared with the original HEVC test model. The Bjøntegaard delta peak signal-to-noise ratio and Bjøntegaard delta bit rate are -0.061 dB and 1.98% on average, which are negligible. The overall performance of the proposed algorithm exceeds that of the state-of-the-art schemes.
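The three-level hierarchical decision structure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-depth score, the threshold pairs, and the names `three_output_decision`, `decide_depth`, and `full_rd_check` are all hypothetical. Each depth level yields one of three outputs (split, do not split, or uncertain), and an uncertain output falls back to the full RD check.

```python
from typing import Callable, List, Tuple

SPLIT, NO_SPLIT, UNCERTAIN = "split", "no_split", "uncertain"


def three_output_decision(score: float, low: float, high: float) -> str:
    """Combine two binary thresholds (stand-ins for two binary classifiers)
    into one three-output decision. All values are hypothetical."""
    if score >= high:
        return SPLIT
    if score <= low:
        return NO_SPLIT
    return UNCERTAIN


def decide_depth(
    scores: List[float],
    thresholds: List[Tuple[float, float]],
    full_rd_check: Callable[[int], bool],
) -> Tuple[int, int]:
    """Walk the quad-tree levels 0..2 and return (chosen depth, RD checks used).

    full_rd_check(depth) stands in for the exhaustive RD comparison and
    returns True when splitting at that depth is the better choice.
    """
    rd_checks = 0
    for depth in range(3):
        low, high = thresholds[depth]
        decision = three_output_decision(scores[depth], low, high)
        if decision == UNCERTAIN:
            # The classifier is not confident: pay for the full RD check.
            rd_checks += 1
            decision = SPLIT if full_rd_check(depth) else NO_SPLIT
        if decision == NO_SPLIT:
            return depth, rd_checks
    return 3, rd_checks  # split at every level: smallest CU size


example = decide_depth(
    scores=[0.9, 0.5, 0.1],
    thresholds=[(0.3, 0.7)] * 3,
    full_rd_check=lambda depth: True,
)
print(example)
```

In this sketch, widening the gap between the two thresholds sends more CUs to the full RD check (lower risk, less time saved), while narrowing it saves more complexity at a higher risk of false prediction.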

Distinguishing Local and Global Edits for Their Simultaneous Propagation in a Uniform Framework

In propagating edits for image editing, some edits are intended to affect limited local regions, while others act globally over the entire image. However, the ambiguity problem in propagating edits is not adequately addressed in existing methods. Thus, tedious user input requirements remain since the user must densely or repeatedly input control samples to suppress ambiguity. In this paper, we address this challenge to propagate edits suitably by marking edits for local or global propagation and determining their reasonable propagation scopes automatically. Thus, our approach avoids propagation conflicts, effectively resolving the ambiguity problem. With the reduction of ambiguity, our method allows fewer and less-precise control samples than existing methods. Furthermore, we provide a uniform framework to propagate local and global edits simultaneously, helping the user to quickly obtain the intended results with reduced labor. With our unified framework, the potentially ambiguous interaction between local and global edits (evident in existing methods that propagate these two edit types in series) is resolved. We experimentally demonstrate the effectiveness of our method compared with existing methods.

Face Recognition Across Non-Uniform Motion Blur, Illumination, and Pose

Existing methods for performing face recognition in the presence of blur are based on the convolution model and cannot handle non-uniform blurring situations that frequently arise from tilts and rotations in hand-held cameras. In this paper, we propose a methodology for face recognition in the presence of space-varying motion blur comprising arbitrarily shaped kernels. We model the blurred face as a convex combination of geometrically transformed instances of the focused gallery face, and show that the set of all images obtained by non-uniformly blurring a given image forms a convex set. We first propose a non-uniform blur-robust algorithm that uses the assumption of a sparse camera trajectory in the camera motion space to build an energy function with an l1-norm constraint on the camera motion. The framework is then extended to handle illumination variations by exploiting the fact that the set of all images obtained from a face image by non-uniform blurring and changing the illumination forms a bi-convex set. Finally, we propose an elegant extension to also account for variations in pose.
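The convex-combination model in this abstract can be written compactly. The notation below is assumed for illustration and is not taken from the paper: f is the focused gallery face, T_k is the k-th geometric transformation in a discretized camera motion space, w_k is the fraction of the exposure time spent at that camera pose, and g is the blurred observation.

```latex
g = \sum_{k=1}^{K} w_k \, T_k(f), \qquad w_k \ge 0, \qquad \sum_{k=1}^{K} w_k = 1
```

Under this reading, a sparse camera trajectory means that most weights w_k are zero, which is what an l1-norm constraint on the weight vector encourages.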

Swarm Intelligence for Detecting Interesting Events in Crowded Environments

This paper focuses on detecting and localizing anomalous events in videos of crowded scenes, i.e., divergences from a dominant pattern. Both motion and appearance information are considered, so as to robustly distinguish different kinds of anomalies, for a wide range of scenarios. A newly introduced concept based on swarm theory, histograms of oriented swarms (HOS), is applied to capture the dynamics of crowded environments. HOS, together with the well-known histograms of oriented gradients, are combined to build a descriptor that effectively characterizes each scene. These appearance and motion features are only extracted within spatiotemporal volumes of moving pixels to ensure robustness to local noise, increase accuracy in the detection of local, nondominant anomalies, and achieve a lower computational cost. Experiments on benchmark data sets containing various situations with human crowds, as well as on traffic data, led to results that surpassed the current state of the art (SoA), confirming the method's efficacy and generality. Finally, the experiments show that our approach achieves significantly higher accuracy, especially for pixel-level event detection compared to SoA methods, at a low computational cost.

Content-Based Image Retrieval Using Features Extracted From Halftoning-Based Block Truncation Coding

This paper presents a technique for content-based image retrieval (CBIR) by exploiting the advantage of low-complexity ordered-dither block truncation coding (ODBTC) for the generation of image content descriptor. In the encoding step, ODBTC compresses an image block into corresponding quantizers and bitmap image. Two image features are proposed to index an image, namely, color co-occurrence feature (CCF) and bit pattern features (BPF), which are generated directly from the ODBTC encoded data streams without performing the decoding process. The CCF and BPF of an image are simply derived from the two ODBTC quantizers and bitmap, respectively, by involving the visual codebook. Experimental results show that the proposed method is superior to the block truncation coding image retrieval systems and the other earlier methods, and thus prove that the ODBTC scheme is not only suited for image compression, because of its simplicity, but also offers a simple and effective descriptor to index images in CBIR system.
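The encoding step, in which a block is reduced to two quantizers and a bitmap, can be illustrated with a small sketch. It assumes that the two quantizers are the block minimum and maximum and that the bitmap comes from comparing each pixel with a dither threshold scaled to the block's range. The dither values and the function names are hypothetical, and the paper's exact ODBTC construction may differ.

```python
from typing import List, Tuple

Block = List[List[float]]


def encode_block(block: Block, dither: Block) -> Tuple[float, float, List[List[int]]]:
    """Compress one block into (low quantizer, high quantizer, bitmap).

    dither holds per-pixel thresholds in [0, 1]; they are scaled to the
    block's dynamic range before the comparison (an assumed convention).
    """
    flat = [p for row in block for p in row]
    lo, hi = min(flat), max(flat)
    bitmap = [
        [1 if p >= lo + d * (hi - lo) else 0 for p, d in zip(prow, drow)]
        for prow, drow in zip(block, dither)
    ]
    return lo, hi, bitmap


def decode_block(lo: float, hi: float, bitmap: List[List[int]]) -> Block:
    """Rebuild the block: 1 bits take the high quantizer, 0 bits the low one."""
    return [[hi if bit else lo for bit in row] for row in bitmap]


block = [[10, 200], [60, 120]]
dither = [[0.2, 0.8], [0.4, 0.6]]  # hypothetical 2 x 2 dither array
lo, hi, bitmap = encode_block(block, dither)
print(lo, hi, bitmap)
print(decode_block(lo, hi, bitmap))
```

Features such as the CCF and BPF would then be computed from the quantizers and the bitmap directly, so no decoding is needed at indexing time.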

High-Resolution Face Verification Using Pore-Scale Facial Features

Face recognition methods, which usually represent face images using holistic or local facial features, rely heavily on alignment. Their performance also degrades severely under variations in expression or pose, especially when there is only one gallery image per subject. With the easy access to high-resolution (HR) face images nowadays, some HR face databases have recently been developed. However, few studies have tackled the use of HR information for face recognition or verification. In this paper, we propose a pose-invariant face-verification method, which is robust to alignment errors, using the HR information based on pore-scale facial features. A new keypoint descriptor, namely, pore-Principal Component Analysis (PCA)-Scale Invariant Feature Transform (PPCASIFT), adapted from PCA-SIFT, is devised for the extraction of a compact set of distinctive pore-scale facial features. Having matched the pore-scale features of two face regions, an effective robust-fitting scheme is proposed for the face-verification task. Experiments show that, with only one frontal-view gallery image per subject, our proposed method outperforms a number of standard verification methods, and can achieve excellent accuracy even when the faces are under large variations in expression and pose.

DERF: Distinctive Efficient Robust Features From the Biological Modeling of the P Ganglion Cells

Studies in neuroscience and biological vision have shown that the human retina has strong computational power, and its information representation supports vision tasks on both ventral and dorsal pathways. In this paper, a new local image descriptor, termed distinctive efficient robust features (DERF), is derived by modeling the response and distribution properties of the parvocellular-projecting ganglion cells in the primate retina. DERF features exponential scale distribution, exponential grid structure, and circularly symmetric function difference of Gaussian (DoG) used as a convolution kernel, all of which are consistent with the characteristics of the ganglion cell array found in neurophysiology, anatomy, and biophysics. In addition, a new explanation for local descriptor design is presented from the perspective of wavelet tight frames. DoG is naturally a wavelet, and the structure of the grid points array in our descriptor is closely related to the spatial sampling of wavelets. The DoG wavelet itself forms a frame, and when we modulate the parameters of our descriptor to make the frame tighter, the performance of the DERF descriptor improves accordingly. This is verified by designing a tight frame DoG, which leads to much better performance. Extensive experiments conducted in the image matching task on the multiview stereo correspondence data set demonstrate that DERF outperforms state of the art methods for both hand-crafted and learned descriptors, while remaining robust and being much faster to compute.

Blind Inpainting Using ℓ0 and Total Variation Regularization

In this paper, we address the problem of reconstructing an image with missing pixels or impulse-noise corruption when the locations of the corrupted pixels are not known. A logarithmic transformation is applied to convert the multiplication between the image and binary mask into an additive problem. The image and mask terms are then estimated iteratively with total variation regularization applied on the image, and ℓ0 regularization on the mask term which imposes sparseness on the support set of the missing pixels. The resulting alternating minimization scheme simultaneously estimates the image and mask, in the same iterative process. The logarithmic transformation also allows the method to be extended to the Rayleigh multiplicative and Poisson observation models. The method can also be extended to impulse noise removal by relaxing the regularizer from the ℓ0 norm to the ℓ1 norm. Experimental results show that the proposed method can deal with a larger fraction of missing pixels than two-phase methods, which first estimate the mask and then reconstruct the image.
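The role of the logarithmic transformation can be made explicit with a short identity. The symbols are assumed notation, not the paper's: y is the observed image, x the clean image, m the mask, and the product is taken pixel by pixel. How pixels with m = 0 are handled in the log domain is not stated in the abstract, so the sketch leaves that open.

```latex
y = m \odot x
\quad\Longrightarrow\quad
\log y = \underbrace{\log x}_{u} + \underbrace{\log m}_{v}
```

Here v is zero wherever m = 1 (uncorrupted pixels) and nonzero only at corrupted pixels, so a sparsity penalty on v is a penalty on the support set of the missing pixels, while total variation acts on the image term.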

A Source-Channel Coding Approach to Digital Image Protection and Self-Recovery

Watermarking algorithms have been widely applied to the field of image forensics recently. One of these very forensic applications is the protection of images against tampering. For this purpose, we need to design a watermarking algorithm fulfilling two purposes in case of image tampering: 1) detecting the tampered area of the received image and 2) recovering the lost information in the tampered zones. State-of-the-art techniques accomplish these tasks using watermarks consisting of check bits and reference bits. Check bits are used for tampering detection, whereas reference bits carry information about the whole image. The problem of recovering the lost reference bits still stands. This paper is aimed at showing that having the tampering location known, image tampering can be modeled and dealt with as an erasure error. Therefore, an appropriate design of channel code can protect the reference bits against tampering. In the present proposed method, the total watermark bit-budget is dedicated to three groups: 1) source encoder output bits; 2) channel code parity bits; and 3) check bits. In watermark embedding phase, the original image is source coded and the output bit stream is protected using appropriate channel encoder. For image recovery, erasure locations detected by check bits help channel erasure decoder to retrieve the original source encoded image. Experimental results show that our proposed scheme significantly outperforms recent techniques in terms of image quality for both watermarked and recovered image. The watermarked image quality gain is achieved through spending less bit-budget on watermark, while image recovery quality is considerably improved as a consequence of consistent performance of designed source and channel codes.
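The idea that tampering with a known location behaves like an erasure can be illustrated with the simplest possible erasure code, a single XOR parity symbol. This is only a toy sketch: the abstract does not say which channel code the authors use, and a practical scheme would need a code that tolerates many erasures. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def add_parity(symbols: List[int]) -> List[int]:
    """Append one XOR parity symbol to the source-coded symbols."""
    parity = reduce(lambda a, b: a ^ b, symbols, 0)
    return symbols + [parity]


def recover_erasure(codeword: List[Optional[int]]) -> List[int]:
    """Recover the data symbols when at most one position is erased.

    Erased positions are marked None; in the watermarking setting these
    positions would be the ones flagged by the check bits.
    """
    missing = [i for i, s in enumerate(codeword) if s is None]
    if len(missing) > 1:
        raise ValueError("a single parity symbol can repair only one erasure")
    repaired = list(codeword)
    if missing:
        # The XOR of a full codeword is 0, so the erased symbol equals
        # the XOR of all surviving symbols.
        repaired[missing[0]] = reduce(
            lambda a, b: a ^ b, (s for s in codeword if s is not None), 0
        )
    return repaired[:-1]  # drop the parity symbol


data = [23, 7, 200, 64]
received = add_parity(data)
received[2] = None  # location reported as tampered
print(recover_erasure(received))
```

Because the decoder knows where the damage is, it needs far less redundancy than it would to correct errors at unknown positions, which is the motivation for modeling tampering as erasure.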

Structured Sparse Priors for Image Classification

Model-based compressive sensing (CS) exploits the structure inherent in sparse signals for the design of better signal recovery algorithms. This information about structure is often captured in the form of a prior on the sparse coefficients, the Laplacian being the most common such choice (leading to l1-norm minimization). The recent seminal contribution by Wright et al. exploits the discriminative capability of sparse representations for image classification, specifically face recognition. Their approach employs the analytical framework of CS with class-specific dictionaries. Our contribution is a logical extension of these ideas into structured sparsity for classification. We use class-specific dictionaries in conjunction with discriminative class-specific priors, specifically the spike-and-slab prior widely applied in Bayesian regression. Significantly, the proposed framework takes the burden off the demand for abundant training image samples necessary for the success of sparsity-based classification schemes.

Video Tracking Using Learned Hierarchical Features

In this paper, we propose an approach to learn hierarchical features for visual object tracking. First, we offline learn features robust to diverse motion patterns from auxiliary video sequences. The hierarchical features are learned via a two-layer convolutional neural network. Embedding the temporal slowness constraint in the stacked architecture makes the learned features robust to complicated motion transformations, which is important for visual object tracking. Then, given a target video sequence, we propose a domain adaptation module to online adapt the pre-learned features according to the specific target object. The adaptation is conducted in both layers of the deep feature learning module so as to include appearance information of the specific target object. As a result, the learned hierarchical features can be robust to both complicated motion transformations and appearance changes of target objects. We integrate our feature learning algorithm into three tracking methods. Experimental results demonstrate that significant improvement can be achieved using our learned hierarchical features, especially on video sequences with complicated motion transformations.

A Global/Local Affinity Graph for Image Segmentation

Construction of a reliable graph capturing perceptual grouping cues of an image is fundamental for graph-cut based image segmentation methods. In this paper, we propose a novel sparse global/local affinity graph over superpixels of an input image to capture both short- and long-range grouping cues, thereby enabling perceptual grouping laws, including proximity, similarity, and continuity, to take effect through a suitable graph-cut algorithm. Moreover, we also evaluate three major visual features, namely, color, texture, and shape, for their effectiveness in perceptual segmentation and propose a simple graph fusion scheme to implement some recent findings from psychophysics, which suggest combining these visual features with different emphases for perceptual grouping. In particular, an input image is first oversegmented into superpixels at different scales. We postulate a gravitation law based on empirical observations and divide superpixels adaptively into small-, medium-, and large-sized sets. Global grouping is achieved using medium-sized superpixels through a sparse representation of superpixels' features by solving an ℓ0-minimization problem, thereby enabling continuity or propagation of local smoothness over long-range connections. Small- and large-sized superpixels are then used to achieve local smoothness through an adjacent graph in a given feature space, thus implementing perceptual laws, for example, similarity and proximity. Finally, a bipartite graph is also introduced to enable propagation of grouping cues between superpixels of different scales. Extensive experiments are carried out on the Berkeley segmentation database in comparison with several state-of-the-art graph constructions. The results show the effectiveness of the proposed approach, which outperforms state-of-the-art graphs using four different objective criteria, namely, the probabilistic Rand index, the variation of information, the global consistency error, and the boundary displacement error.

An Efficient MRF Embedded Level Set Method for Image Segmentation

This paper presents a fast and robust level set method for image segmentation. To enhance the robustness against noise, we embed a Markov random field (MRF) energy function to the conventional level set energy function. This MRF energy function builds the correlation of a pixel with its neighbors and encourages them to fall into the same region. To obtain a fast implementation of the MRF embedded level set model, we explore algebraic multigrid (AMG) and sparse field method (SFM) to increase the time step and decrease the computation domain, respectively. Both AMG and SFM can be conducted in a parallel fashion, which facilitates the processing of our method for big image databases. By comparing the proposed fast and robust level set method with the standard level set method and its popular variants on noisy synthetic images, synthetic aperture radar (SAR) images, medical images, and natural images, we comprehensively demonstrate the new method is robust against various kinds of noises. In particular, the new level set method can segment an image of size 500 × 500 within 3 s on MATLAB R2010b installed in a computer with 3.30-GHz CPU and 4-GB memory.

Weighted Guided Image Filtering

It is known that local filtering-based edge-preserving smoothing techniques suffer from halo artifacts. In this paper, a weighted guided image filter (WGIF) is introduced by incorporating an edge-aware weighting into an existing guided image filter (GIF) to address the problem. The WGIF inherits advantages of both global and local smoothing filters in the sense that: 1) the complexity of the WGIF is O(N) for an image with N pixels, which is the same as that of the GIF, and 2) the WGIF can avoid halo artifacts like the existing global smoothing filters. The WGIF is applied for single image detail enhancement, single image haze removal, and fusion of differently exposed images. Experimental results show that the resultant algorithms produce images with better visual quality, and that halo artifacts in the final images are reduced or avoided with a negligible increase in running time.

Multi-task Pose-Invariant Face Recognition

Face images captured in unconstrained environments usually contain significant pose variation, which dramatically degrades the performance of algorithms designed to recognize frontal faces. This paper proposes a novel face identification framework capable of handling the full range of pose variations within ±90° of yaw. The proposed framework first transforms the original pose-invariant face recognition problem into a partial frontal face recognition problem. A robust patch-based face representation scheme is then developed to represent the synthesized partial frontal faces. For each patch, a transformation dictionary is learnt under the proposed multi-task learning scheme. The transformation dictionary transforms the features of different poses into a discriminative subspace. Finally, face matching is performed at patch level rather than at the holistic level. Extensive and systematic experimentation on FERET, CMU-PIE, and Multi-PIE databases shows that the proposed method consistently outperforms single-task-based baselines as well as state-of-the-art methods for the pose problem. We further extend the proposed algorithm for the unconstrained face verification problem and achieve top-level performance on the challenging LFW data set.

A Feature-Enriched Completely Blind Image Quality Evaluator

Existing blind image quality assessment (BIQA) methods are mostly opinion-aware. They learn regression models from training images with associated human subjective scores to predict the perceptual quality of test images. Such opinion-aware methods, however, require a large number of training samples that cover a variety of distortion types and have associated human subjective scores. The BIQA models learned by opinion-aware methods often have weak generalization capability, thereby limiting their usability in practice. By comparison, opinion-unaware methods do not need human subjective scores for training, and thus have greater potential for good generalization capability. Unfortunately, thus far no opinion-unaware BIQA method has shown consistently better quality prediction accuracy than the opinion-aware methods. Here, we aim to develop an opinion-unaware BIQA method that can compete with, and perhaps outperform, the existing opinion-aware methods. By integrating the features of natural image statistics derived from multiple cues, we learn a multivariate Gaussian model of image patches from a collection of pristine natural images. Using the learned multivariate Gaussian model, a Bhattacharyya-like distance is used to measure the quality of each image patch, and then an overall quality score is obtained by average pooling. The proposed BIQA method needs neither distorted sample images nor subjective quality scores for training, yet extensive experiments demonstrate its superior quality-prediction performance to the state-of-the-art opinion-aware BIQA methods. The MATLAB source code of our algorithm is publicly available at www.comp.polyu.edu.hk/~cslzhang/IQA/ILNIQE/ILNIQE.htm.
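For orientation, a Bhattacharyya-like distance between two multivariate Gaussian models commonly takes the form below, which is the form used by the earlier NIQE index. Whether this abstract's method uses exactly this expression is an assumption. The symbols are illustrative: (μ, Σ) are the mean and covariance learned from pristine patches, and (μ_i, Σ_i) are those fitted to the i-th patch of the test image.

```latex
d_i = \sqrt{ (\mu - \mu_i)^{\top} \left( \frac{\Sigma + \Sigma_i}{2} \right)^{-1} (\mu - \mu_i) },
\qquad
Q = \frac{1}{M} \sum_{i=1}^{M} d_i
```

The second expression is the average pooling step over the M patches of the image; a larger Q indicates a larger departure from the pristine model.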

Spatiotemporal Saliency Detection for Video Sequences Based on Random Walk With Restart

A novel saliency detection algorithm for video sequences based on the random walk with restart (RWR) is proposed in this paper. We adopt RWR to detect spatially and temporally salient regions. More specifically, we first find a temporal saliency distribution using the features of motion distinctiveness, temporal consistency, and abrupt change. Among them, the motion distinctiveness is derived by comparing the motion profiles of image patches. Then, we employ the temporal saliency distribution as a restarting distribution of the random walker. In addition, we design the transition probability matrix for the walker using the spatial features of intensity, color, and compactness. Finally, we estimate the spatiotemporal saliency distribution by finding the steady-state distribution of the walker. The proposed algorithm detects foreground salient objects faithfully, while suppressing cluttered backgrounds effectively, by incorporating the spatial transition matrix and the temporal restarting distribution systematically. Experimental results on various video sequences demonstrate that the proposed algorithm outperforms conventional saliency detection algorithms qualitatively and quantitatively.
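The steady-state computation at the core of a random walk with restart can be sketched with a plain power iteration. The sketch assumes a column-stochastic transition matrix (entry P[i][j] is the probability of moving from node j to node i) and a restart probability of 0.2; both conventions, the toy matrix, and the function name are illustrative choices rather than values from the paper.

```python
from typing import List


def rwr_steady_state(
    P: List[List[float]],
    r: List[float],
    restart: float = 0.2,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> List[float]:
    """Iterate p <- (1 - restart) * P p + restart * r until convergence.

    P: column-stochastic transition matrix (spatial features in the paper).
    r: restart distribution (temporal saliency in the paper).
    """
    n = len(r)
    p = list(r)
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(P[i][j] * p[j] for j in range(n))
            + restart * r[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy graph with three nodes; the walker always restarts at node 0.
P = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
r = [1.0, 0.0, 0.0]
saliency = rwr_steady_state(P, r)
print([round(s, 4) for s in saliency])
```

In the setting of the abstract, the nodes would be image regions, and the steady-state vector would be read as the spatiotemporal saliency of each region.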

Sorted Consecutive Local Binary Pattern for Texture Classification

In this paper, we propose a sorted consecutive local binary pattern (scLBP) for texture classification. Conventional methods encode only patterns whose spatial transitions are not more than two, whereas scLBP encodes patterns regardless of their spatial transition. Conventional methods do not encode patterns with more than two spatial transitions on account of rotation-invariant encoding; however, such patterns have discriminative power. The proposed scLBP encodes all patterns with any number of spatial transitions while maintaining their rotation-invariant nature by sorting the consecutive patterns. In addition, we introduce dictionary learning of scLBP based on a kd-tree, which separates data with a space partitioning strategy. Since the elements of sorted consecutive patterns lie in different spaces, they can be converted into a discriminative code with the kd-tree. Finally, we present a framework in which scLBPs and the kd-tree can be combined and utilized. The results of experimental evaluation on five texture data sets (Outex, CUReT, UIUC, UMD, and KTH-TIPS2-a) indicate that our proposed framework achieves the best classification rate on the CUReT, UMD, and KTH-TIPS2-a data sets compared with conventional methods. The results additionally indicate that only a marginal difference exists between the best classification rate of conventional methods and that of the proposed framework on the UIUC and Outex data sets.
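One way to see why sorting yields rotation invariance is to encode a circular binary pattern by the lengths of its runs of identical bits and then sort those lengths. The sketch below shows only that idea. It is one plausible reading of "sorting the consecutive patterns", the function names are hypothetical, and the paper's actual encoding (for example, how runs of 0s and 1s are kept apart) may differ.

```python
from typing import List, Tuple


def circular_run_lengths(bits: List[int]) -> List[int]:
    """Lengths of maximal runs of equal bits, treating the pattern as a ring."""
    n = len(bits)
    if all(b == bits[0] for b in bits):
        return [n]
    # Start at a transition so that no run is split by the wrap-around.
    start = next(i for i in range(n) if bits[i] != bits[i - 1])
    runs, count = [], 1
    for k in range(1, n):
        cur = bits[(start + k) % n]
        prev = bits[(start + k - 1) % n]
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs


def sorted_consecutive_code(bits: List[int]) -> Tuple[int, ...]:
    """Sort the run lengths so that every rotation maps to the same code."""
    return tuple(sorted(circular_run_lengths(bits), reverse=True))


pattern = [1, 0, 1, 1, 0, 0, 1, 0]  # six spatial transitions
codes = {
    sorted_consecutive_code(pattern[s:] + pattern[:s])
    for s in range(len(pattern))
}
print(codes)
```

Rotating the pattern only changes where the ring is cut, not the multiset of run lengths, so all eight rotations collapse to a single code even though the pattern has more than two spatial transitions.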

Robust 2D Principal Component Analysis: A Structured Sparsity Regularized Approach

Principal component analysis (PCA) is widely used to extract features and reduce dimensionality in various computer vision and image/video processing tasks. Conventional approaches either lack robustness to outliers and corrupted data or are designed for one-dimensional signals. To address this problem, we propose a robust PCA model for two-dimensional images incorporating structured sparse priors, referred to as structured sparse 2D-PCA. This robust model considers the prior of structured and grouped pixel values in two dimensions. As the proposed formulation is jointly nonconvex and nonsmooth, which is difficult to tackle by joint optimization, we develop a two-stage alternating minimization approach to solve the problem. This approach iteratively learns the projection matrices by bidirectional decomposition and utilizes the proximal method to obtain the structured sparse outliers. By considering the structured sparsity prior, the proposed model becomes less sensitive to noisy data and outliers in two dimensions. Moreover, the computational cost indicates that the robust two-dimensional model is capable of processing quarter common intermediate format video in real time, as well as handling large-size images and videos, which is often intractable with other robust PCA approaches that involve image-to-vector conversion. Experimental results on robust face reconstruction, video background subtraction data set, and real-world videos show the effectiveness of the proposed model compared with conventional 2D-PCA and other robust PCA algorithms.

Accurate Vessel Segmentation With Constrained B-Snake

We describe an active contour framework with accurate shape and size constraints on the vessel cross-sectional planes to produce the vessel segmentation. It starts with a multiscale vessel axis tracing in a 3D computed tomography (CT) data, followed by vessel boundary delineation on the cross-sectional planes derived from the extracted axis. The vessel boundary surface is deformed under constrained movements on the cross sections and is voxelized to produce the final vascular segmentation. The novelty of this paper lies in the accurate contour point detection of thin vessels based on the CT scanning model, in the efficient implementation of missing contour points in the problematic regions and in the active contour model with accurate shape and size constraints. The main advantage of our framework is that it avoids disconnected and incomplete segmentation of the vessels in the problematic regions that contain touching vessels (vessels in close proximity to each other), diseased portions (pathologic structure attached to a vessel), and thin vessels. It is particularly suitable for accurate segmentation of thin and low contrast vessels. Our method is evaluated and demonstrated on CT data sets from our partner site, and its results are compared with three related methods. Our method is also tested on two publicly available databases and its results are compared with the recently published method. The applicability of the proposed method to some challenging clinical problems, the segmentation of the vessels in the problematic regions, is demonstrated with good results on both quantitative and qualitative experimentations; our segmentation algorithm can delineate vessel boundaries that have level of variability similar to those obtained manually.

PatchMatch With Potts Model for Object Segmentation and Stereo Matching

This paper presents a unified variational formulation for joint object segmentation and stereo matching, which takes both accuracy and efficiency into account. In our approach, the depth map consists of compact objects, and each object is represented through three different aspects: the perimeter in image space; the slanted object depth plane; and the planar bias, which adds an additional level of detail on top of each object plane in order to model depth variations within an object. Compared with traditional high quality solving methods in low level, we use a convex formulation of the multilabel Potts Model with PatchMatch stereo techniques to generate a depth map for each image at the object level and show that accurate multiple view reconstruction can be achieved with our formulation by means of induced homography without discretization or staircasing artifacts. Our model is formulated as an energy minimization that is optimized via a fast primal-dual algorithm, which can handle several hundred object depth segments efficiently. Performance evaluations in the Middlebury benchmark data sets show that our method outperforms the traditional integer-valued disparity strategy as well as the original PatchMatch algorithm and its variants in subpixel accurate disparity estimation. The proposed algorithm is also evaluated and shown to produce consistently good results for various real-world data sets (KITTI benchmark data sets and multiview benchmark data sets).

Robust Representation and Recognition of Facial Emotions Using Extreme Sparse Learning

Recognition of natural emotions from human faces is an interesting topic with a wide range of potential applications, such as human-computer interaction, automated tutoring systems, image and video retrieval, smart environments, and driver warning systems. Traditionally, facial emotion recognition systems have been evaluated on laboratory controlled data, which is not representative of the environment faced in real-world applications. To robustly recognize the facial emotions in real-world natural situations, this paper proposes an approach called extreme sparse learning, which has the ability to jointly learn a dictionary (set of basis) and a nonlinear classification model. The proposed approach combines the discriminative power of extreme learning machine with the reconstruction property of sparse representation to enable accurate classification when presented with noisy signals and imperfect data recorded in natural settings. In addition, this paper presents a new local spatio-temporal descriptor that is distinctive and pose-invariant. The proposed framework is able to achieve the state-of-the-art recognition accuracy on both acted and spontaneous facial emotion databases.

Adaptive Image Denoising by Targeted Databases

We propose a data-dependent denoising procedure to restore noisy images. Different from existing denoising algorithms which search for patches from either the noisy image or a generic database, the new algorithm finds patches from a database that contains only relevant patches. We formulate the denoising problem as an optimal filter design problem and make two contributions. First, we determine the basis function of the denoising filter by solving a group sparsity minimization problem. The optimization formulation generalizes existing denoising algorithms and offers systematic analysis of the performance. Improvement methods are proposed to enhance the patch search process. Second, we determine the spectral coefficients of the denoising filter by considering a localized Bayesian prior. The localized prior leverages the similarity of the targeted database, alleviates the intensive Bayesian computation, and links the new method to the classical linear minimum mean squared error estimation. We demonstrate applications of the proposed method in a variety of scenarios, including text images, multiview images and face images. Experimental results show the superiority of the new algorithm over existing methods.
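The link to classical linear minimum mean squared error (LMMSE) estimation mentioned above can be illustrated with a scalar sketch. It assumes a single coefficient with prior mean mu and prior variance var_signal, observed in additive zero-mean noise of variance var_noise; these names are illustrative and are not the paper's notation, and the sketch is not the paper's filter.

```python
def lmmse_estimate(y, mu, var_signal, var_noise):
    """Scalar LMMSE estimate of x from y = x + n (n zero-mean, independent of x)."""
    # Shrinkage gain: close to 1 when the prior is broad, close to 0 when noise dominates.
    gain = var_signal / (var_signal + var_noise)
    # Pull the noisy observation toward the prior mean by the gain.
    return mu + gain * (y - mu)


# Hypothetical numbers chosen only for illustration.
print(lmmse_estimate(y=10.0, mu=4.0, var_signal=3.0, var_noise=1.0))
```

With zero noise variance the gain is 1 and the observation is returned unchanged; as the noise variance grows, the estimate falls back to the prior mean.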

Progressive Halftone Watermarking Using Multi-layer Table Lookup Strategy

In this paper, a halftoning-based multilayer watermarking of low computational complexity is proposed. An additional data-hiding technique is also employed to embed multiple watermarks into the watermark to be embedded to improve the security and embedding capacity. At the encoder, the efficient direct binary search method is employed to generate 256 reference tables to ensure the output is in halftone format. Subsequently, watermarks are embedded by a set of optimized compressed tables with various textural angles for table lookup. At the decoder, the least mean square metric is considered to increase the differences among those generated phenotypes of the embedding angles and reduce the required number of dimensions for each angle. Finally, the naive Bayes classifier is employed to collect the possibilities of multilayer information for classifying the associated angles to extract the embedded watermarks. These decoded watermarks can be further overlapped for retrieving the additional hidden-layer watermarks. Experimental results show that the proposed method requires only 8.4 ms for embedding a watermark into an image of size 512 × 512 on a 32-bit Windows 7 platform with an Intel Core i7 (Sandy Bridge) processor and 4 GB of RAM, using the Visual Studio 2010 IDE. Finally, only 2 MB is required to store the proposed compressed reference table.

Learning Multiple Linear Mappings for Efficient Single Image Super-Resolution

Example learning-based superresolution (SR) algorithms show promise for restoring a high-resolution (HR) image from a single low-resolution (LR) input. The most popular approaches, however, are either time- or space-intensive, which limits their practical applications in many resource-limited settings. In this paper, we propose a novel computationally efficient single image SR method that learns multiple linear mappings (MLM) to directly transform LR feature subspaces into HR subspaces. In particular, we first partition the large nonlinear feature space of LR images into a cluster of linear subspaces. Multiple LR subdictionaries are then learned, followed by inferring the corresponding HR subdictionaries based on the assumption that the LR-HR features share the same representation coefficients. We establish MLM from the input LR features to the desired HR outputs in order to achieve fast yet stable SR recovery. Furthermore, in order to suppress displeasing artifacts generated by the MLM-based method, we apply a fast nonlocal means algorithm to construct a simple yet effective similarity-based regularization term for SR enhancement. Experimental results indicate that our approach is both quantitatively and qualitatively superior to other application-oriented SR methods, while maintaining relatively low time and space complexity.
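The idea of partitioning the LR feature space and applying one linear mapping per subspace can be sketched as follows. This is a minimal illustration only: the centroids, the mapping matrices, and the function names are hypothetical, and the learning of subdictionaries and the nonlocal means regularization described above are not shown.

```python
def nearest_cluster(feature, centroids):
    """Index of the centroid closest to the feature (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda k: sqdist(feature, centroids[k]))


def apply_mapping(matrix, vector):
    """Matrix-vector product with the matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def map_lr_to_hr(feature, centroids, mappings):
    """Pick the linear subspace for an LR feature, then apply its LR-to-HR mapping."""
    k = nearest_cluster(feature, centroids)
    return apply_mapping(mappings[k], feature)


# Two hypothetical clusters; each mapping takes a 2-D LR feature to a 3-D HR feature.
centroids = [[0.0, 0.0], [1.0, 1.0]]
mappings = [
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]],
]
print(map_lr_to_hr([0.9, 1.2], centroids, mappings))
```

At run time each feature costs one nearest-centroid search and one matrix-vector product, which is why a scheme of this shape can be fast compared with solving a sparse coding problem per patch.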

Cross-Domain Person Re-Identification Using Domain Adaptation Ranking SVMs

This paper addresses a new person reidentification problem without label information of persons under nonoverlapping target cameras. Given the matched (positive) and unmatched (negative) image pairs from source domain cameras, as well as unmatched (negative) and unlabeled image pairs from target domain cameras, we propose an adaptive ranking support vector machines (AdaRSVMs) method for reidentification under target domain cameras without person labels. To overcome the problems introduced due to the absence of matched (positive) image pairs in the target domain, we relax the discriminative constraint to a necessary condition only relying on the positive mean in the target domain. To estimate the target positive mean, we make use of all the available data from source and target domains as well as constraints in person reidentification. Inspired by adaptive learning methods, a new discriminative model with high confidence in target positive mean and low confidence in target negative image pairs is developed by refining the distance model learnt from the source domain. Experimental results show that the proposed AdaRSVM outperforms existing supervised or unsupervised, learning or non-learning reidentification methods without using label information in target cameras. Moreover, our method achieves better reidentification performance than existing domain adaptation methods derived under equal conditional probability assumption.

Structure-Sensitive Saliency Detection via Multilevel Rank Analysis in Intrinsic Feature Space

This paper advocates a novel multiscale, structure-sensitive saliency detection method, which can distinguish multilevel, reliable saliency from various natural pictures in a robust and versatile way. One key challenge for saliency detection is to guarantee that the entire salient object is characterized differently from the nonsalient background. To tackle this, our strategy is to design a structure-aware descriptor based on the intrinsic biharmonic distance metric. One benefit of introducing this descriptor is its ability to simultaneously integrate local and global structure information, which is extremely valuable for separating the salient object from the nonsalient background in a multiscale sense. Having devised such a powerful shape descriptor, the remaining challenge is to capture the saliency so that salient subparts actually stand out among all possible candidates. Toward this goal, we conduct multilevel low-rank and sparse analysis in the intrinsic feature space spanned by the shape descriptors defined on over-segmented super-pixels. Since the low-rank property places much more emphasis on stronger similarities among super-pixels, we naturally obtain a scale space along the rank dimension in this way. Multiscale saliency can be obtained by simply computing differences among the low-rank components across the rank scale. We conduct extensive experiments on several public benchmarks, and make a comprehensive, quantitative comparison between our method and existing state-of-the-art techniques. All the results demonstrate the superiority of our method in accuracy, reliability, robustness, and versatility.

Depth Reconstruction From Sparse Samples: Representation, Algorithm, and Sampling

The rapid development of 3D technology and computer vision applications has motivated a thrust of methodologies for depth acquisition and estimation. However, most existing hardware and software methods have limited performance due to poor depth precision, low resolution and high computational cost. In this paper, we present a computationally efficient method to recover dense depth maps from sparse measurements. We make three contributions. First, we provide empirical evidence that depth maps can be encoded much more sparsely than natural images by using common dictionaries such as wavelets and contourlets. We also show that a combined wavelet-contourlet dictionary achieves better performance than using either dictionary alone. Second, we propose an alternating direction method of multipliers (ADMM) to achieve fast reconstruction. A multi-scale warm start procedure is proposed to speed up the convergence. Third, we propose a two-stage randomized sampling scheme to optimally choose the sampling locations, thus maximizing the reconstruction performance for any given sampling budget. Experimental results show that the proposed method produces high quality dense depth estimates, and is robust to noisy measurements. Applications to real data in stereo matching are demonstrated.
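To make the ADMM step concrete, the sketch below runs the standard ADMM iteration on a toy sparse problem: minimize 0.5*||x - b||^2 + lam*||x||_1. This is an assumption-laden simplification, not the paper's solver. It omits the sampling operator, the wavelet-contourlet dictionary, and the multi-scale warm start, and the names (admm_l1_denoise, lam, rho) are illustrative.

```python
def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def admm_l1_denoise(b, lam, rho=1.0, iters=100):
    """ADMM for min_x 0.5*||x - b||^2 + lam*||z||_1 subject to x = z."""
    n = len(b)
    z = [0.0] * n  # sparse copy of the variable
    u = [0.0] * n  # scaled dual variable
    for _ in range(iters):
        # x-update: closed-form minimizer of the quadratic terms.
        x = [(b[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: soft-thresholding enforces sparsity.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # Dual update: accumulate the constraint violation x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


print(admm_l1_denoise([3.0, 0.5, -3.0], lam=1.0))
```

For this toy objective the exact minimizer is the soft-thresholded input, so the iterates can be checked against soft_threshold(b_i, lam) directly.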

Image Denoising by Exploring External and Internal Correlations

Single image denoising suffers from limited data collection within a noisy image. In this paper, we propose a novel image denoising scheme, which explores both internal and external correlations with the help of web images. For each noisy patch, we build internal and external data cubes by finding similar patches from the noisy and web images, respectively. We then propose reducing noise by a two-stage strategy using different filtering approaches. In the first stage, since the noisy patch may lead to inaccurate patch selection, we propose a graph-based optimization method to improve patch matching accuracy in external denoising. The internal denoising is frequency truncation on internal cubes. By combining the internal and external denoising patches, we obtain a preliminary denoising result. In the second stage, we propose reducing noise by filtering the external and internal cubes, respectively, in the transform domain. In this stage, the preliminary denoising result not only enhances the patch matching accuracy but also provides reliable estimates of filtering parameters. The final denoised image is obtained by fusing the external and internal filtering results. Experimental results show that our method consistently outperforms state-of-the-art denoising schemes in both subjective and objective quality measurements, e.g., it achieves >2 dB gain compared with BM3D at a wide range of noise levels.

Motion-Compensated Coding and Frame Rate Up-Conversion: Models and Analysis

Block-based motion estimation (ME) and compensation (MC) techniques are widely used in modern video processing algorithms and compression systems. The great variety of video applications and devices results in numerous compression specifications. Specifically, there is a diversity of frame-rates and bit-rates. In this paper, we study the effect of frame-rate and compression bit-rate on block-based ME and MC as commonly utilized in inter-frame coding and frame-rate up conversion (FRUC). This joint examination yields a comprehensive foundation for comparing MC procedures in coding and FRUC. First, the video signal is modeled as a noisy translational motion of an image. Then, we theoretically model the motion-compensated prediction of available and absent frames, as in coding and FRUC applications, respectively. The theoretical MC-prediction error is further analyzed and its autocorrelation function is calculated for coding and FRUC applications. We show a linear relation between the variance of the MC-prediction error and temporal-distance. While the affecting distance in MC-coding is between the predicted and reference frames, MC-FRUC is affected by the distance between the available frames used for the interpolation. Moreover, the dependency on temporal-distance implies an inverse effect of the frame-rate. FRUC performance analysis considers the prediction error variance, since it equals the mean-squared-error of the interpolation. However, MC-coding analysis requires the entire autocorrelation function of the error; hence, analytic simplicity is beneficial. Therefore, we propose two constructions of a separable autocorrelation function for prediction error in MC-coding. We conclude by comparing our estimations with experimental results.

Fractal Analysis for Reduced Reference Image Quality Assessment

In this paper, multifractal analysis is adapted to reduced-reference image quality assessment (RR-IQA). A novel RR-IQA approach is proposed, which measures the difference of spatial arrangement between the reference image and the distorted image in terms of spatial regularity measured by fractal dimension. An image is first expressed in the Log-Gabor domain. Then, fractal dimensions are computed on each Log-Gabor subband and concatenated as a feature vector. Finally, the extracted features are pooled as the quality score of the distorted image using ℓ1 distance. Compared with existing approaches, the proposed method measures image quality from the perspective of the spatial distribution of image patterns. The proposed method was evaluated on seven public benchmark data sets. Experimental results have demonstrated the excellent performance of the proposed method in comparison with state-of-the-art approaches.
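The two ingredients named above, a fractal dimension per feature map and ℓ1 pooling of the feature difference, can be sketched on toy data. The example below uses plain box counting on binary point sets as a stand-in; the Log-Gabor decomposition and the multifractal analysis of the paper are not reproduced, and all names are illustrative.

```python
import math


def box_count(points, size):
    """Number of size-by-size boxes that contain at least one point."""
    return len({(x // size, y // size) for x, y in points})


def box_counting_dimension(points, sizes):
    """Least-squares slope of log(count) against log(1/size)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    n = len(sizes)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(p - q) for p, q in zip(a, b))


# Toy "reference" (filled square) and "distorted" (diagonal line) patterns on a 64x64 grid.
square = [(i, j) for i in range(64) for j in range(64)]
line = [(i, i) for i in range(64)]
sizes = [1, 2, 4, 8, 16]

features_ref = [box_counting_dimension(square, sizes)]
features_dist = [box_counting_dimension(line, sizes)]
print(l1_distance(features_ref, features_dist))
```

In this toy setup a larger ℓ1 distance between the two feature vectors means a larger change in spatial regularity; a real RR-IQA feature vector would hold one entry per subband rather than a single value.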

Criteria-Based Modulation for Multilevel Inverters

Pulse width modulation schemes are aimed at adjusting the fundamental component while reducing the harmonic content of an inverter output voltage or current. This paper addresses the topic of optimal inverter operation in reference to a given objective function. The objective function could embody either a single performance criterion, such as voltage or current total harmonic distortion (THD), or a weighted sum of multiple criteria. The proposed method ensures primacy of the chosen solution while imposing no restriction over its modulation index. In particular, operating the inverter by the chosen solution would result in performances superior to any other modulation scheme commutating in an equal number of switching angles per fundamental cycle. The proposed method allows for the consideration of practical inverter constraints and prevents the possibility of impractical switching sequence. A detailed investigation of the method is given, accompanied by two practical cases minimizing, respectively, phase-voltage THD and line-current THD of a three-level inverter. Selected simulation and experimental results are presented to validate the theoretical part.
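As a small illustration of THD used as an objective function, the sketch below evaluates the voltage THD of a unit-amplitude three-level waveform with a single switching angle alpha per quarter cycle (zero output for angles below alpha, full output until pi - alpha) and searches a grid for the angle with the lowest THD. This is a textbook special case chosen here as an assumption; it is not the paper's optimization method, and it places no constraint on the modulation index.

```python
import math


def thd_quasi_square(alpha):
    """Voltage THD of a unit three-level waveform with one switching angle alpha (radians)."""
    # Mean square of the waveform: it is +/-1 for a fraction (1 - 2*alpha/pi) of the cycle.
    rms_sq = 1.0 - 2.0 * alpha / math.pi
    # Fundamental amplitude is 4*cos(alpha)/pi, so its mean square is half of that squared.
    fund_rms_sq = 8.0 * math.cos(alpha) ** 2 / math.pi ** 2
    return math.sqrt(rms_sq / fund_rms_sq - 1.0)


def best_single_angle(step_deg=0.1, max_deg=60.0):
    """Grid search for the switching angle that minimizes THD."""
    steps = int(round(max_deg / step_deg))
    candidates = [math.radians(i * step_deg) for i in range(steps + 1)]
    return min(candidates, key=thd_quasi_square)


alpha = best_single_angle()
print(round(math.degrees(alpha), 1), round(thd_quasi_square(alpha), 4))
```

Setting alpha to zero recovers the plain square wave, whose THD is sqrt(pi^2/8 - 1), roughly 48%; introducing the zero level lowers this considerably.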

A Fully Soft-Switched Single Switch Isolated DC-DC Converter

This paper proposes a single-switch soft-switching isolated converter. The proposed converter is able to offer low cost and high power density in high step-up applications due to the following features: ZCS turn-on and ZVS turn-off of the switch and ZCS turn-off of the diodes regardless of voltage and load variation; a simple gate driver circuit and a low-rated lossless snubber; and reduced transformer volume compared to flyback- or forward-based converters due to low magnetizing current. Experimental results on a 100 kHz, 250 W prototype are provided to validate the proposed concept.

Model Predictive Control Methods to Reduce Common-Mode Voltage for Three-Phase Voltage Source Inverters

In this paper, we propose model predictive control methods to reduce the common-mode voltage of three-phase voltage source inverters (VSIs). In the reduced common-mode voltage-model predictive control (RCMV-MPC) methods proposed in this paper, only nonzero voltage vectors are utilized to reduce the common-mode voltage as well as to control the load currents. In addition, two nonzero voltage vectors are selected from the cost function at every sampling period, instead of using only one optimal vector during one sampling period. The two selected nonzero vectors are distributed in one sampling period in such a way as to minimize the error between the measured load current and the reference. Without utilizing the zero vectors, the common-mode voltage controlled by the proposed RCMV-MPC algorithms can be restricted within ±Vdc/6. Furthermore, application of the two nonzero vectors with optimal time sharing between them can yield satisfactory load current ripple performance without using the zero vectors. Thus, the proposed RCMV-MPC methods can reduce the common-mode voltage as well as control the load currents with fast transient response and satisfactory load current ripple performance compared with the conventional model predictive control method. Simulation and experimental results are included to verify the effectiveness of the proposed RCMV-MPC methods.
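The ±Vdc/6 bound can be checked by enumerating the eight switching states of a two-level three-phase VSI. The sketch assumes the common-mode voltage is defined as the average of the three pole voltages measured from the dc-link midpoint, so that each leg contributes +Vdc/2 or -Vdc/2; the dc-link value of 600 V is an arbitrary illustration.

```python
from itertools import product


def common_mode_voltage(state, vdc):
    """Common-mode voltage for a switching state (Sa, Sb, Sc), each entry 0 or 1."""
    # Each leg connects its output to +vdc/2 (upper switch on) or -vdc/2 (lower switch on).
    pole_voltages = [vdc / 2 if s else -vdc / 2 for s in state]
    return sum(pole_voltages) / 3


vdc = 600.0  # hypothetical dc-link voltage in volts
for state in product((0, 1), repeat=3):
    kind = "zero" if sum(state) in (0, 3) else "nonzero"
    print(state, kind, common_mode_voltage(state, vdc))
```

The two zero vectors (000 and 111) give ±Vdc/2, while all six nonzero vectors give ±Vdc/6, which is why excluding the zero vectors caps the common-mode voltage at one third of its conventional peak.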

Interleaved Phase-Shift Full-Bridge Converter With Transformer Winding Series–Parallel Autoregulated (SPAR) Current Doubler Rectifier

The analysis and design guidelines for a two-phase interleaved phase-shift full-bridge converter with transformer winding series-parallel autoregulated current doubler rectifier are presented in this paper. The secondary windings of two transformers work in parallel when the equivalent duty cycle is smaller than 0.25 but in series when the duty cycle is larger than 0.25 owing to the series-parallel autoregulated rectifier. With the proposed rectifying structure, the voltage stress of the rectifier is reduced. Also, the interleaving operation reduces the output current ripple. A 1-kW prototype with 200-400-V input and 50-V/20-A output is built up to verify the theoretical analysis.

Analysis of Active-Network Converter with Coupled Inductors

High step-up voltage gain dc-dc converters are widely applied in fuel cell stacks, photovoltaic arrays, battery sources, and high-intensity discharge lamp power systems. Active-network converters with coupled inductors (CL-ANC) are derived from switched inductor active-network converters (SL-ANC). The proposed converter contains two coupled inductors, which can be integrated into one magnetic core, and two power switches. The converter can provide a relatively high voltage conversion ratio with a small duty cycle; the voltage and current stresses of the power switches are low, which helps to reduce the losses. This paper shows the key waveforms of the CL-ANC and a detailed derivation of the steady-state operation principle. The voltage conversion ratio and the effect of the leakage inductance on voltage gain are discussed. The voltage stress and current stress on the power devices are illustrated, and a comparison between the proposed converter and the SL-ANC is given. Finally, a prototype has been built in the lab with 200 V and 400 V output under different turns ratios. Experimental results are given to verify the correctness of the analysis.

Modeling and Controller Design of a Semi-Isolated Multi-Input Converter for Hybrid PV/Wind Power Charger System

The objective of this paper is to propose the development of a multiinput dc-dc converter (MIC) family, which is composed of isolated and/or nonisolated dc-dc converters. By analyzing five basic isolated dc-dc converters, four isolated pulsating voltage source cells and three isolated pulsating current source cells are generated. Moreover, a semiisolated multiinput converter (S-MIC) for hybrid PV/wind power charger system which can simplify the power system, reduce the cost, deliver continuous power, and overcome high-voltage-transfer-ratio problems is proposed. In this paper, the operational principle of the proposed S-MIC is explained, the small-signal ac model is derived, and the controller design is developed. Computer simulations and experimental results are presented to verify the accuracy of the proposed small-signal ac model and the performance of the proposed S-MIC.

A Four-Switch Three-Phase SEPIC-Based Inverter

The four-switch three-phase (FSTP) inverter has been proposed as an innovative inverter design to reduce the cost, complexity, size, and switching losses of the dc-ac conversion system. Traditional FSTP inverter usually operates at half the dc input voltage; hence, the output line voltage cannot exceed this value. This paper proposes a novel design for the FSTP inverter based on the topology of the single-ended primary-inductance converter (SEPIC). The proposed topology provides pure sinusoidal output voltages with no need for output filter. Compared to traditional FSTP inverter, the proposed FSTP SEPIC inverter improves the voltage utilization factor of the input dc supply, where the proposed topology provides higher output line voltage which can be extended up to the full value of the dc input voltage. The integral sliding-mode control is used with the proposed topology to optimize its dynamics and to ensure robustness of the system during different operating conditions. Derivation of the equations describing the parameters design, components ratings, and the operation of the proposed SEPIC inverter is presented in this paper. Simulation model and experimental setup are used to validate the proposed concept. Simulations and experimental results show the effectiveness of the proposed inverter.

High-Efficiency Isolated Single-Input Multiple-Output Bidirectional Converter

This study presents a high-efficiency isolated single-input multiple-output bidirectional (HISMB) converter for a power storage system. According to the power management, the proposed HISMB converter can operate at a step-up state (energy release) and a step-down state (energy storage). At the step-up state, it can boost the voltage of a low-voltage input power source to a high-voltage-side dc bus and middle-voltage terminals. When the high-voltage-side dc bus has excess energy, the energy can be transmitted in the reverse direction. The high-voltage dc bus can serve as the main power source, and the middle-voltage output terminals can supply power to individual middle-voltage dc loads or charge auxiliary power sources (e.g., battery modules). In this study, a coupled-inductor-based HISMB converter accomplishes the bidirectional power control with the properties of voltage clamping and soft switching, and the corresponding device specifications are adequately designed. As a result, the energy of the leakage inductor of the coupled inductor can be recycled and released to the high-voltage-side dc bus and auxiliary power sources, and the voltage stresses on power switches can be greatly reduced. Moreover, the switching losses can be significantly decreased because all power switches have zero-voltage-switching features. Therefore, the objectives of high-efficiency power conversion, electric isolation, bidirectional energy transmission, and multiple output voltages at different levels can be achieved. The effectiveness of the proposed HISMB converter is verified by experimental results of a kW-level prototype in practical applications.

Modularized Control Strategy and Performance Analysis of DFIG System under Unbalanced and Harmonic Grid Voltage

The paper presents a modularized control strategy of doubly fed induction generator (DFIG) system, including the grid-side converter (GSC) and rotor-side converter (RSC), under unbalanced and harmonic grid voltage. The sequence decomposition process and complicated control reference calculation can be avoided in the proposed control strategy. From the perspective of power grid friendly operation, the control targets of DFIG system in this paper are chosen as: 1) smooth active and reactive power injected into the power grid; 2) balanced and sinusoidal current injected into the power grid. The RSC and GSC can work as two independent modules and the communication between RSC and GSC can be removed. Furthermore, the third harmonic current component, dc-link voltage fluctuation, and electromagnetic torque pulsation under the different control targets are theoretically analyzed. Finally, the availability of the proposed modularized control strategy of DFIG system under unbalanced and distorted grid voltage is verified by experiment results.

Resonant Switched-Capacitor Voltage Regulator with Ideal Transient Response

A new, small and efficient voltage regulator realized using a resonant switched capacitor converter (RSCC) technology is introduced. Voltage regulation is implemented by means of simple digital pulse density modulation (PDM). It displays an ideal transient response with a zero-order response to all disturbance types. The newly developed RSCC acts as a gyrator with a wide range of voltage conversion ratios (below as well as above unity) with constant efficiency characteristics for the entire operation range. The operation of the voltage regulator is verified on a 20W experimental prototype, demonstrating ideal transient recovery without over/undershoots in response to load and line transients. Simple design guidelines for the voltage regulation system are provided and verified by experiments.
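Digital pulse density modulation can be sketched with an integer accumulator that decides, cycle by cycle, whether the converter runs an active switching cycle or skips it. The abstract does not specify the modulator, so the first-order accumulator below, the function name, and the density expressed as a fraction num/den are all assumptions made for illustration.

```python
def pdm_sequence(num, den, cycles):
    """Return a list of 1 (active cycle) or 0 (skipped cycle) with average density num/den.

    Requires 0 <= num <= den. Integer arithmetic avoids floating-point drift.
    """
    acc = 0
    pulses = []
    for _ in range(cycles):
        acc += num          # accumulate the requested density
        if acc >= den:      # overflow: emit an active cycle and keep the remainder
            acc -= den
            pulses.append(1)
        else:
            pulses.append(0)
    return pulses


seq = pdm_sequence(3, 10, 100)
print(seq[:10], sum(seq))
```

Because the accumulator carries its remainder forward, the active cycles are spread as evenly as the requested ratio allows, and over any whole number of den-cycle windows the pulse count matches the target density exactly.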

On the Performance of Multiobjective Evolutionary Algorithms in Automatic Parameter Extraction of Power Diodes

In this paper, a general, robust, and automatic parameter extraction of nonlinear compact models is presented. The parameter extraction is based on multiobjective optimization using evolutionary algorithms, which allow fitting of several highly nonlinear and highly conflicting characteristics simultaneously. Two multiobjective evolutionary algorithms which have been proved to be robust for a wide range of multiobjective problems [1]-[3], the nondominated sorting genetic algorithm II and the multiobjective covariance matrix adaptation evolution strategy, are used in the parameter extraction of a novel power diode compact model based on the lumped charge technique. The performance of the algorithms is assessed using a systematic statistical approach. Good agreement between the simulated and measured characteristics of the power diode shows the accuracy of the used compact model and the efficiency and effectiveness of the proposed multiobjective optimization scheme.

Development of a Wind Interior Permanent-Magnet Synchronous Generator Based Microgrid and Its Operation Control

This paper presents the development of a wind interior permanent-magnet synchronous generator (IPMSG)-based dc microgrid and its operation control. First, the derated characteristics of PMSG systems with various ac/dc converters and operation controls are comparatively analyzed. Then, the IPMSG followed by a three-phase Vienna switch-mode rectifier is developed to establish the common dc bus of dc microgrid. Good developed power and voltage regulation characteristics are achieved via the proposed commutation tuning, robust current, and voltage controls. Second, a single-phase three-wire inverter is constructed to serve as the test load. Good ac 220 V/110 V output voltage waveforms under unknown and nonlinear loads are preserved by the developed robust waveform tracking control scheme. Third, a battery energy storage system is established, and the fast energy storage support response is obtained via the proposed droop control approach with adaptive predictive current control method. In addition, a chopped dump load is equipped to enhance the energy balance control flexibility.

A Novel Drive Method for High-Speed Brushless DC Motor Operating in a Wide Range

In this paper, a novel drive method, which is different from the traditional motor drive techniques, for high-speed brushless DC (BLDC) motors is proposed and verified by a series of experiments. It is well known that the BLDC motor can be driven by either pulse-width modulation (PWM) techniques with a constant dc-link voltage or pulse-amplitude modulation (PAM) techniques with an adjustable dc-link voltage. However, to the best of our knowledge, few studies provide a proper drive method for a high-power, high-speed BLDC motor over a wide speed range. Therefore, a detailed theoretical comparison of PWM control and PAM control for the high-speed BLDC motor is first given. Then, the conclusion is drawn that PAM control is superior to PWM control at high speed because it decreases the commutation delay and the high-frequency harmonics. Meanwhile, a new high-speed BLDC motor drive method based on a hybrid approach combining PWM and PAM is proposed. Finally, the feasibility and effectiveness of the performance comparison and the new drive method are verified by several experiments.

The Dynamic Control of Reactive Power for the Brushless Doubly Fed Induction Machine with Indirect Stator-quantities Control Scheme

Compared to the doubly fed induction machine (DFIM), the brushless doubly fed induction machine (BDFIM) has higher reliability by virtue of the absence of a brush gear. Recent research on structure optimization design and control strategy of BDFIM has made remarkable progress. BDFIM indirect stator-quantities control (ISC) is a new control strategy, which, in comparison to vector control strategy, requires fewer parameters and does not need rotating coordinate transformation. This paper further develops the dynamic control of reactive power for the BDFIM with ISC scheme. Detailed theoretical analysis is done to show the controller structure of the reactive power. The experimental results of the prototype show the feasibility of the proposed scheme. As a result, the proposed ISC controllers have been able to control not only speed and torque, but also the reactive power.

An LCL-LC Filter for Grid-Connected Converter: Topology, Parameter and Analysis

In order to further cut down the cost of the filter for a grid-connected pulsewidth modulation (PWM) converter under increasingly stringent grid codes, a new kind of high-order filter, named the LCL-LC filter, is presented in this paper. The resonant frequency characteristics of the filter are analyzed, and a parameter design method based on these characteristics is also proposed. The proposed parameter design method can easily make full use of the existing research results on traditional LCL filter parameter design. Then, a parameter robustness analysis method based on four-dimensional graphics is proposed to analyze the parameter robustness of the presented filter. Compared with the traditional one, the proposed analysis method can analyze the filter performance under variations of several parameters at a time without any iteration. The comparative analysis and discussion considering the LCL filter, the trap filter, and the LCL-LC filter are presented and verified through experiments on a 5 kW grid-connected converter prototype. Experimental results demonstrate the accuracy of the theoretical analysis and show that the presented filter performs better than the two others.

3D microtransformers for DC-DC on-chip power conversion

We address the miniaturization of power converters by introducing novel 3-D microtransformers with magnetic core for low-megahertz frequency applications. The core is fabricated by lamination and microstructuring of Metglas 2714A magnetic alloy. The solenoids of the microtransformers are wound around the core using a ball-wedge wirebonder. The wirebonding process is fast, allowing the fabrication of solenoids with up to 40 turns in 10 s. The fabricated devices yield the high inductance per unit volume of 2.95 μH/mm3 and energy per unit volume of 133 nJ/mm3 at the frequency of 1 MHz. The power efficiency of 64-76% is measured for different turns ratio with coupling factors as high as 98%. To demonstrate the applicability of our passive components, two PWM controllers were selected to implement an isolated and a nonisolated switch-mode power supply. The isolated converter operates with overall efficiency of 55% and maximum output power of 136 mW; then, we experimentally demonstrate how we increased this efficiency to 71% and output power to 408 mW. The nonisolated converter can deliver an overall efficiency of 81% with a maximum output power of 515 mW. Finally, we benchmarked the results to underline the potential of the technology for power on-chip applications.

Indirect Matrix Converter-Based Topology and Modulation Schemes for Enhancing Input Reactive Power Capability

A new topology based on indirect matrix converter (IMC) is proposed to enhance the input reactive power capability. This topology consists of a conventional IMC and an auxiliary switching network (ASN), which is connected to the dc-link of the IMC in parallel. With the aid of ASN, an implicit current source converter-based static synchronous compensator can be embedded into an IMC, which lays a foundation for the input reactive power control. Based on the proposed topology, two modulation schemes are presented, and the formations of the output voltage and input reactive current are decoupled in both of them. To minimize power loss and improve input current quality, a double closed-loop control algorithm is introduced, in which the current through the dc inductor in ASN is controlled to be minimum. Different from the conventional IMC, the input reactive power of the topology is independent of its load condition without considering the practical constraints. The effectiveness of the proposed topology and modulation scheme is confirmed by experimental results.

Closed Loop Discontinuous Modulation Technique for Capacitor Voltage Ripples and Switching Losses Reduction in Modular Multilevel Converters

In this paper, a new discontinuous modulation technique is presented for the operation of the modular multilevel converter (MMC). The modulation technique is based on adding a zero sequence to the original modulation signals so that the MMC arms are clamped to the upper or lower terminals of the dc-link bus. The clamping intervals are controlled according to the absolute value of the output current to minimize the switching losses of the MMC. A significant reduction in the capacitor voltage ripples is achieved, especially when operating with low modulation indices. Furthermore, a circulating current control strategy suitable for this modulation technique is also proposed. Simulation and experimental results under various operating points are reported along with evaluation and comparison results against a conventional carrier-based pulse width modulation method.
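
The zero-sequence clamping idea can be illustrated with a minimal sketch. It assumes a generic three-phase set of normalized references in [-1, 1] and a simple rule (clamp the phase carrying the largest absolute current whenever the other references stay within range). The function names and the rule itself are illustrative assumptions, not the control law of the paper, and MMC-specific details such as arm references and circulating current control are left out.

```python
import math

EPS = 1e-9  # tolerance for floating-point comparisons against the rails


def clamp_offset(refs, currents):
    """Return a zero-sequence offset that clamps one phase to a dc rail.

    refs: normalized three-phase references in [-1, 1].
    currents: instantaneous phase currents, same order as refs.
    Assumed rule: try phases in order of decreasing |current| and clamp the
    first one for which all shifted references remain inside [-1, 1].
    """
    order = sorted(range(3), key=lambda k: abs(currents[k]), reverse=True)
    for k in order:
        rail = 1.0 if refs[k] >= 0.0 else -1.0
        offset = rail - refs[k]
        if all(-1.0 - EPS <= r + offset <= 1.0 + EPS for r in refs):
            return offset
    return 0.0  # no feasible clamp: fall back to the original references


def discontinuous_refs(refs, currents):
    """Add the same offset to all phases, so line-to-line voltages are unchanged."""
    offset = clamp_offset(refs, currents)
    return [r + offset for r in refs]


def three_phase(amplitude, theta):
    """Balanced three-phase cosine set (hypothetical test signal)."""
    return [amplitude * math.cos(theta - 2.0 * math.pi * k / 3.0) for k in range(3)]
```

Because the same offset is added to every phase, the differences between phase references are preserved, while the clamped phase stops switching during the interval in which it conducts the largest current.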

Decentralized Inverse-Droop Control for Input-Series-Output-Parallel DC-DC Converters

Input-series-output-parallel dc-dc converters are suited for high input voltage and low output voltage applications. This letter presents a decentralized inverse-droop control for this configuration. Each module is self-contained and no central controller is needed, which improves the system modularity, reliability, and flexibility. With the proposed inverse-droop control, the output voltage reference rises as the load becomes heavy. Even though the input voltages are not used in the inverse-droop loop, power sharing, including input voltage sharing and output current sharing, can still be well achieved. Besides, the output voltage regulation characteristic is not affected by variation of the input voltage. The operation principle is introduced, and the stability of the strategy is also revealed based on small-signal modeling. Finally, experiments are conducted to verify the effectiveness of the control strategy.
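
The phrase "the output voltage reference rises as the load becomes heavy" can be read as a reference law with a positive slope. The sketch below assumes a linear law driven by each module's own output current. The linear form, the function names, and the numeric values are hypothetical and are not taken from the letter.

```python
def inverse_droop_reference(v_nominal, gain, i_out):
    """Assumed inverse-droop law: the reference rises with module output current."""
    return v_nominal + gain * i_out


def conventional_droop_reference(v_nominal, gain, i_out):
    """Conventional droop, shown only for contrast: the reference falls with current."""
    return v_nominal - gain * i_out


# Hypothetical 12 V module with a 0.05 V/A slope.
light_load = inverse_droop_reference(12.0, 0.05, 2.0)
heavy_load = inverse_droop_reference(12.0, 0.05, 10.0)
```

Each module evaluates this law locally, which is consistent with the decentralized structure described in the abstract.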

Detailed Analysis of DC-Link Virtual Impedance-Based Suppression Method for Harmonics Interaction in High-Power PWM Current-Source Motor Drives

For high-power PWM current-source motor drive systems, due to the low converter switching frequency and the relatively small dc choke used to reduce cost and weight, the converters' switching harmonics may interact through the dc link and produce interharmonics in the entire system. This harmonics interaction phenomenon may give rise to system resonance at certain motor speeds, which degrades the grid-side power quality and generates excessive torque ripples on the motor side. The resonance caused by the harmonics interaction in high-power PWM current-source motor drives was investigated in a previous study. In addition, to actively suppress such resonance, the basic idea of a dc-link virtual impedance-based suppression method has also been proposed. This paper extends the previous study to thoroughly analyze the mechanism and realization of resonance suppression by the dc-link virtual impedance-based method. The in-depth analysis shows that the dc-link virtual impedance-based method successfully enables the active interharmonics compensation capability of high-power PWM current-source drives, which was not addressed in previous research. Moreover, simulations and experiments demonstrate that, by following the coefficient selection for the suppression method discussed in this paper, the dc-link virtual impedance-based method can effectively enhance the attenuation effect of the dc link in high-power PWM current-source drive systems so as to suppress the resonance due to the harmonics interaction under all resonance conditions.

An Online Frequency-Domain Junction Temperature Estimation Method for IGBT Modules

This letter proposes a new frequency-domain thermal model for online junction temperature estimation of insulated-gate bipolar transistor (IGBT) modules. The proposed model characterizes the thermal behavior of an IGBT module by a linear time-invariant (LTI) system, whose frequency response is obtained by applying the fast Fourier transform (FFT) to the time derivative of the transient thermal impedance from junction to a reference position of the IGBT module. The junction temperature of the IGBT is then estimated using the frequency responses of the LTI system and the heat sources of the IGBT module. Simulation results show that the proposed method is computationally efficient for an accurate online junction temperature estimation of IGBT modules in both steady-state and transient loading conditions.
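
The relationship described above can be written compactly. The symbols are assumptions introduced here for illustration: Z_th(t) is the junction-to-reference transient thermal impedance, h(t) the impulse response of the LTI system, P(t) the heat source (power loss), and T_ref(t) the temperature at the reference position.

```latex
\begin{aligned}
h(t) &= \frac{\mathrm{d}Z_{\mathrm{th}}(t)}{\mathrm{d}t}, \\
H(j\omega) &= \mathcal{F}\{h(t)\}, \qquad P(j\omega) = \mathcal{F}\{P(t)\}, \\
T_j(t) &= T_{\mathrm{ref}}(t) + \mathcal{F}^{-1}\{H(j\omega)\,P(j\omega)\}
\end{aligned}
```

The last line is the frequency-domain form of the convolution of the power loss with the impulse response. With several heat sources, one such term per source would be summed.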

Characterization of a Silicon IGBT and Silicon Carbide MOSFET Cross Switch Hybrid

A parallel arrangement of a silicon (Si) IGBT and a silicon carbide (SiC) MOSFET is experimentally demonstrated. The concept, referred to as the cross-switch (XS) hybrid, aims to reach optimum power device performance by providing low static and dynamic losses while improving the overall electrical and thermal properties through the combination of the bipolar Si IGBT and unipolar SiC MOSFET characteristics. To demonstrate the XS hybrid, the parallel configuration is implemented experimentally in a single package for devices rated at 1200 V. Test results are obtained to validate this approach with respect to static and dynamic performance when compared to full Si IGBT and full SiC MOSFET reference devices having the same power ratings as the XS hybrid samples.

LCL Filter Design and Inductor Current Ripple Analysis for 3-level NPC Grid Interface Converter

The harmonic filter for a three-level neutral-point-clamped (NPC) grid interface converter is designed in this paper with good filtering performance and small component size. The LCL topology is selected because of its tradeoff between attenuation and size. The design of the inverter-side inductor L1 is emphasized due to its cost. A detailed inductor current ripple analysis is given based on space vector modulation. The analysis derives the inductor volt-second and the maximum current ripple equation over a line cycle. It also reveals the switching-cycle current ripple distribution over a line cycle, taking the power factor into account. The total system loss is calculated for different ripple currents. Inductor L1 is determined by the tradeoff between loss and size. The capacitor and the grid-side inductor L2 are then designed based on the attenuation requirement. Different damping circuits for the LCL filter are compared and investigated in detail. The filter design is verified by both simulation and 200-kVA three-level NPC converter hardware.
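
The abstract does not give its design equations. As background for the damping discussion, the standard textbook expression for the undamped resonance frequency of an LCL filter, f_res = (1/2π)·sqrt((L1 + L2)/(L1·L2·C)), can be evaluated as below. The component values are hypothetical and are not those of the 200-kVA prototype.

```python
import math


def lcl_resonance_hz(l1, l2, c):
    """Undamped LCL resonance frequency in Hz (inductances in H, capacitance in F)."""
    return math.sqrt((l1 + l2) / (l1 * l2 * c)) / (2.0 * math.pi)


# Hypothetical values: L1 = 200 uH, L2 = 50 uH, C = 100 uF.
f_res = lcl_resonance_hz(200e-6, 50e-6, 100e-6)
```

For these assumed values the resonance lies near 2.5 kHz. A design would typically keep it well away from both the grid frequency and the switching frequency.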

Virtual RC Damping of LCL-Filtered Voltage Source Converters with Extended Selective Harmonic Compensation

Active damping and harmonic compensation are two common challenges faced by LCL-filtered voltage source converters. To manage them holistically, this paper begins by proposing a virtual RC damper in parallel with the passive filter capacitor. The virtual damper is actively inserted by feeding back the passive capacitor current through a high-pass filter, which indirectly furnishes two superior features: the mitigation of the phase lag experienced by a conventional damper and the avoidance of instability caused by the negative resistance inserted unintentionally. Moreover, with the virtual RC damper, the frequency region within which the harmonic compensation is effective can be extended beyond the gain crossover frequency. This is of interest to some high-performance applications, but has not yet been achieved by existing schemes. The performance of the proposed scheme has been tested in the laboratory, with results demonstrating stability and harmonic compensation.

Versatile Control of Unidirectional AC-DC Boost Converters for Power Quality Mitigation

This paper introduces a versatile control scheme for unidirectional ac-dc boost converters for the purpose of improving grid power quality. Since most power factor correction circuits available in the commercial market utilize unidirectional ac-dc boost converter topologies, this is an almost no-cost solution for compensating harmonic current and reactive power in residential applications. Harmonic current and reactive power compensation methods in the unidirectional ac-dc boost converter are investigated. An additional focus of this paper is to quantify the input current distortions caused by a unidirectional ac-dc boost converter that supplies not only active power to the load but also reactive power. Due to these input current distortions, the amount of reactive power injected from an individual converter into the grid should be restricted. Experimental results are presented to validate the effectiveness of the proposed control method.

Aalborg Inverter: A new type of “Buck in Buck, Boost in Boost” Grid-tied Inverter

This paper presents a new family of high-efficiency DC/AC grid-tied inverters with a wide variation of input DC voltage. It is a kind of “Boost in Boost, Buck in Buck” inverter, meaning that only one power stage works at high frequency, to achieve the minimum switching loss. Furthermore, the minimum filtering inductance in the power loop is achieved to reduce the conduction power loss, whether in “Boost” or “Buck” mode. In theory, it can achieve higher efficiency than other inverters under the same input DC voltage condition. The principle of operation is fully illustrated through the analysis of the equivalent circuits of a “three-level” single-phase inverter. Simulations show that it has good control performance.

Grid-connected Forward Micro-inverter with Primary-Parallel Secondary-Series Transformer

This paper presents a primary-parallel secondary-series multicore forward microinverter for photovoltaic ac-module application. The presented microinverter operates with a constant off-time boundary mode control, providing MPPT capability and unity power factor. The proposed multitransformer solution allows using low-profile unitary turns ratio transformers. Therefore, the transformers are better coupled and the overall performance of the microinverter is improved. Due to the multiphase solution, the number of devices increases but the current stress and losses per device are reduced contributing to an easier thermal management. Furthermore, the decoupling capacitor is split among the phases, contributing to a low-profile solution without electrolytic capacitors suitable to be mounted in the frame of a PV module. The proposed solution is compared to the classical parallel-interleaved approach, showing better efficiency in a wide power range and improving the weighted efficiency.

A Single-Stage PhotoVoltaic System for a Dual Inverter fed Open-End Winding Induction Motor Drive for Pumping Applications

This paper presents an integrated solution for a photovoltaic (PV)-fed water-pump drive system, which uses an open-end winding induction motor (OEWIM). The dual-inverter-fed OEWIM drive achieves the functionality of a three-level inverter and requires a low dc-bus voltage. This helps in an optimal arrangement of PV modules, which avoids large strings and helps improve the PV performance over a wide range of operating voltage. It also reduces the voltage rating of the dc-link capacitors and switching devices used in the system. The proposed control strategy integrates maximum power point tracking and V/f control for the efficient utilization of the PV panels and the motor. The proposed control scheme requires sensing of the PV voltage and current only. Thus, the system requires fewer sensors. The analytical, simulation, and experimental results of this work under different environmental conditions are presented in this paper.

Vector Sparse Representation of Color Image Using Quaternion Matrix Analysis

Traditional sparse image models treat a color image pixel as a scalar, and either represent the color channels separately or concatenate the color channels as a monochrome image. In this paper, we propose a vector sparse representation model for color images using quaternion matrix analysis. As a new tool for color image representation, its potential applications in several image-processing tasks are presented, including color image reconstruction, denoising, inpainting, and super-resolution. The proposed model represents the color image as a quaternion matrix, where a quaternion-based dictionary learning algorithm is presented using the K-quaternion singular value decomposition (QSVD) (generalized K-means clustering for QSVD) method. It conducts the sparse basis selection in quaternion space, which uniformly transforms the channel images to an orthogonal color space. In this new color space, it is significant that the inherent color structures can be completely preserved during vector reconstruction. Moreover, the proposed sparse model is more efficient than current sparse models for image restoration tasks due to lower redundancy between the atoms of different color channels. The experimental results demonstrate that the proposed sparse image model successfully avoids the hue bias issue and shows its potential as a general and powerful tool in the color image analysis and processing domain.

Matching of Large Images Through Coupled Decomposition

In this paper, we address the problem of fast and accurate extraction of points that correspond to the same location (named tie-points) from pairs of large-sized images. First, we conduct a theoretical analysis of the performance of the full-image matching approach, demonstrating its limitations when applied to large images. Subsequently, we introduce a novel technique to impose spatial constraints on the matching process without employing subsampled versions of the reference and the target image, which we name coupled image decomposition. This technique splits images into corresponding subimages through a process that is theoretically invariant to geometric transformations, additive noise, and global radiometric differences, as well as being robust to local changes. After presenting it, we demonstrate how coupled image decomposition can be used both for image registration and for automatic estimation of epipolar geometry. Finally, coupled image decomposition is tested on a data set consisting of several planetary images of different sizes, varying from less than one megapixel to several hundreds of megapixels. The reported experimental results, which include comparisons with full-image matching and state-of-the-art techniques, demonstrate the substantial computational cost reduction that can be achieved when matching large images through coupled decomposition, without compromising the overall matching accuracy.

Accurate Stereo Matching by Two-Step Energy Minimization

In stereo matching, cost-filtering methods and energy-minimization algorithms are considered as two different techniques. Due to their global extent, energy-minimization methods obtain good stereo matching results. However, they tend to fail in occluded regions, in which cost-filtering approaches obtain better results. In this paper, we intend to combine both approaches with the aim of improving overall stereo matching results. We show that a global optimization with a fully connected model can be solved by cost-filtering methods. Based on this observation, we propose to perform stereo matching as a two-step energy-minimization algorithm. We consider two Markov random field (MRF) models: 1) a fully connected model defined on the complete set of pixels in an image and 2) a conventional locally connected model. We solve the energy-minimization problem for the fully connected model, after which the marginal function of the solution is used as the unary potential in the locally connected MRF model. Experiments on the Middlebury stereo data sets show that the proposed method achieves state-of-the-art results.

Hierarchical Graphical Models for Simultaneous Tracking and Recognition in Wide-Area Scenes

We present a unified framework to track multiple people, as well as localize and label their activities, in complex long-duration video sequences. To do this, we focus on two aspects: 1) the influence of tracks on the activities performed by the corresponding actors and 2) the structural relationships across activities. We propose a two-level hierarchical graphical model, which learns the relationship between tracks, the relationship between tracks and their corresponding activity segments, as well as the spatiotemporal relationships across activity segments. Such contextual relationships between tracks and activity segments are exploited at both levels of the hierarchy for increased robustness. An L1-regularized structure learning approach is proposed for this purpose. While it is well known that the availability of the labels and locations of activities can help in determining tracks more accurately and vice versa, most current approaches have dealt with these problems separately. Inspired by research in the area of biological vision, we propose a bidirectional approach that integrates both bottom-up and top-down processing, i.e., bottom-up recognition of activities using computed tracks and top-down computation of tracks using the obtained recognition. We demonstrate our results on the recent and publicly available UCLA and VIRAT data sets consisting of realistic indoor and outdoor surveillance sequences.

CID2013: A Database for Evaluating No-Reference Image Quality Assessment Algorithms

This paper presents a new database, CID2013, to address the issue of using no-reference (NR) image quality assessment algorithms on images with multiple distortions. Current NR algorithms struggle to handle images with many concurrent distortion types, such as real photographic images captured by different digital cameras. The database consists of six image sets; on average, 30 subjects have evaluated 12-14 devices depicting eight different scenes for a total of 79 different cameras, 480 images, and 188 subjects (67% female). The subjective evaluation method was a hybrid absolute category rating-pair comparison developed for the study and presented in this paper. This method utilizes a slideshow of all images within a scene to allow the test images to work as references to each other. In addition to mean opinion score value, the images are also rated using sharpness, graininess, lightness, and color saturation scales. The CID2013 database contains images used in the experiments with the full subjective data plus extensive background information from the subjects. The database is made freely available for the research community.

Density Maximization for Improving Graph Matching With Its Applications

Graph matching has been widely used in both the image processing and computer vision domains due to its powerful performance for structural pattern representation. However, it poses three challenges to image sparse feature matching: 1) the combinatorial nature limits the size of the possible matches; 2) it is sensitive to outliers because its objective function prefers more matches; and 3) it works poorly when handling many-to-many object correspondences, due to its assumption of one single cluster of true matches. In this paper, we address these challenges with a unified framework called density maximization (DM), which maximizes the values of a proposed graph density estimator both locally and globally. DM leads to the integration of feature matching, outlier elimination, and cluster detection. Experimental evaluation demonstrates that it significantly boosts the true matches and enables graph matching to handle both outliers and many-to-many object correspondences. We also extend it to dense correspondence estimation and obtain a large improvement over the state-of-the-art methods. We further demonstrate the usefulness of our methods using three applications: 1) instance-level image retrieval; 2) mask transfer; and 3) image enhancement.

Edge-Preserving Image Denoising via Group Coordinate Descent on the GPU

Image denoising is a fundamental operation in image processing, and its applications range from the direct (photographic enhancement) to the technical (as a subproblem in image reconstruction algorithms). In many applications, the number of pixels has continued to grow, while the serial execution speed of computational hardware has begun to stall. New image processing algorithms must exploit the power offered by massively parallel architectures like graphics processing units (GPUs). This paper describes a family of image denoising algorithms well-suited to the GPU. The algorithms iteratively perform a set of independent, parallel 1D pixel-update subproblems. To match GPU memory limitations, they perform these pixel updates in-place and only store the noisy data, denoised image, and problem parameters. The algorithms can handle a wide range of edge-preserving roughness penalties, including differentiable convex penalties and anisotropic total variation. Both algorithms use the majorize-minimize framework to solve the 1D pixel update subproblem. Results from a large 2D image denoising problem and a 3D medical imaging denoising problem demonstrate that the proposed algorithms converge rapidly in terms of both iteration and run-time.
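
The in-place pixel-update idea can be sketched on a 1D signal in serial Python. The sketch assumes a Huber roughness penalty on nearest-neighbor differences and splits the pixels into even and odd groups so that pixels within a group do not depend on each other. The penalty choice, grouping, parameter values, and function names are illustrative assumptions and do not reproduce the GPU algorithms of the paper.

```python
def huber(t, delta):
    """Huber penalty: quadratic near zero, linear in the tails (edge-preserving)."""
    a = abs(t)
    return 0.5 * t * t if a <= delta else delta * a - 0.5 * delta * delta


def huber_weight(t, delta):
    """Curvature psi'(t)/t of the quadratic surrogate that majorizes the Huber penalty."""
    a = abs(t)
    return 1.0 if a <= delta else delta / a


def cost(x, y, beta, delta):
    """Data-fit term plus beta times the roughness of neighboring differences."""
    data = sum(0.5 * (xi - yi) ** 2 for xi, yi in zip(x, y))
    rough = sum(huber(x[i + 1] - x[i], delta) for i in range(len(x) - 1))
    return data + beta * rough


def update_group(x, y, beta, delta, start):
    """One majorize-minimize step, in place, for pixels start, start+2, ...

    Pixels in the same group are never neighbors, so these updates are
    mutually independent and could run in parallel.
    """
    n = len(x)
    for j in range(start, n, 2):
        num, den = y[j], 1.0
        for k in (j - 1, j + 1):
            if 0 <= k < n:
                w = beta * huber_weight(x[j] - x[k], delta)
                num += w * x[k]
                den += w
        x[j] = num / den  # closed-form minimizer of the quadratic surrogate


def denoise(y, beta=2.0, delta=0.1, sweeps=50):
    """Alternate even and odd group updates; only x, y and parameters are stored."""
    x = list(y)
    for _ in range(sweeps):
        update_group(x, y, beta, delta, 0)
        update_group(x, y, beta, delta, 1)
    return x
```

Each surrogate minimization cannot increase the cost, so the cost is non-increasing over sweeps. In 2D or 3D the same pattern would need more groups, chosen so that no two pixels in a group share a neighborhood.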

Single Image Super-Resolution Based on Gradient Profile Sharpness

Single image super-resolution is a classic and active image processing problem, which aims to generate a high resolution image from a low resolution input image. Due to the severely under-determined nature of this problem, an effective image prior is necessary to make the problem solvable, and to improve the quality of generated images. In this paper, a novel image super-resolution algorithm is proposed based on GPS (Gradient Profile Sharpness). GPS is an edge sharpness metric, which is extracted from two gradient description models, i.e., a triangle model and a Gaussian mixture model for the description of different kinds of gradient profiles. Then the transformation relationship of GPSs in different image resolutions is studied statistically, and the parameter of the relationship is estimated automatically. Based on the estimated GPS transformation relationship, two gradient profile transformation models are proposed for the two profile description models, which can keep the profile shape and the profile gradient magnitude sum consistent during profile transformation. Finally, the target gradient field of the HR (high resolution) image is generated from the transformed gradient profiles, and is added as the image prior in the HR image reconstruction model. Extensive experiments are conducted to evaluate the proposed algorithm in terms of subjective visual effect, objective quality, and computation time. The experimental results demonstrate that the proposed approach can generate superior HR images with better visual quality, lower reconstruction error, and acceptable computation efficiency compared with state-of-the-art methods.

Learning Person–Person Interaction in Collective Activity Recognition

Collective activity is a collection of atomic activities (individual person's activity) and can hardly be distinguished by an atomic activity in isolation. The interactions among people are important cues for recognizing collective activity. In this paper, we concentrate on modeling the person-person interactions for collective activity recognition. Rather than relying on hand-crafted descriptions of the person-person interaction, we propose a novel learning-based approach that is capable of computing the class-specific person-person interaction patterns. In particular, we model each class of collective activity by an interaction matrix, which is designed to measure the connection between any pair of atomic activities in a collective activity instance. We then formulate an interaction response (IR) model by assembling all these measurements and make the IRs class-specific and distinct from each other. A multitask IR is further proposed to jointly learn different person-person interaction patterns, in order to learn the relation between different person-person interactions while keeping more distinct activity-specific factors for each interaction. Our model is able to exploit a discriminative low-rank representation of person-person interaction. Experimental results on two challenging data sets demonstrate that our proposed model is comparable with the state-of-the-art models and show that learning person-person interactions plays a critical role in collective activity recognition.

Background Subtraction Based on Low-Rank and Structured Sparse Decomposition

Low-rank and sparse representation based methods, which make few specific assumptions about the background, have recently attracted wide attention in background modeling. With these methods, moving objects in the scene are modeled as pixel-wise sparse outliers. However, in many practical scenarios, the distributions of these moving parts are not truly pixel-wise sparse but structurally sparse. Meanwhile, a robust analysis mechanism is required to handle background regions or foreground movements with varying scales. Based on these two observations, we first introduce a class of structured sparsity-inducing norms to model moving objects in videos. In our approach, we regard the observed sequence as being constituted of two terms, a low-rank matrix (background) and a structured sparse outlier matrix (foreground). Next, by virtue of adaptive parameters for dynamic videos, we propose a saliency measurement to dynamically estimate the support of the foreground. Experiments on challenging well-known data sets demonstrate that the proposed approach outperforms the state-of-the-art methods and works effectively on a wide range of complex videos.