Deep Learning on Computer Vision...

National Taiwan University

Dept. of EE, Research Assistant

Yen-Cheng Liu

Deep Learning on Computer Vision Applications

Outline

• Convolutional Neural Network (CNN)

• Applications

• Super-resolution

• Compression Artifact Reduction

• Reconstruct Compressed Image

• Context Inpainting

• Colorization

• Sketch Simplification

• Style Transfer

• Depth Estimation

• Semantic Segmentation

• Standard Model

• Image-to-Image Model


No math in this tutorial

How I feel when no math appears in a paper

Convolutional Neural Network

CNN History

• In the 1990s, CNNs were the dominant tool, but they then fell out of fashion, particularly in computer vision, with the rise of support vector machines (SVMs).

• In 2012, CNNs became popular again after their significant success in the ILSVRC.

Standard structure

Convolution layers and pooling layers, followed by fully connected layers

Convolutional Neural Network

Components

• Convolution layers, pooling layers, and fully connected layers

• Purpose: originally designed for classification (e.g., LeNet)

[Figure: Convolution Layer (Input Feature Map, Filters, Output Feature Map), Pooling Layer, Fully-Connected Layer. Image Credit: Stanford CS231n]
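To make the three components concrete, here is a minimal NumPy sketch of one convolution layer, one pooling layer, and one fully connected layer chained into a classifier. It is an illustration under assumptions, not LeNet itself: the 28x28 input, the single 5x5 filter, the ReLU non-linearity, the 10 classes, and the random (untrained) weights are all made up for the example.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid (no padding) 2-D cross-correlation of one channel with one filter."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fit are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

def fully_connected(features, weights, bias):
    """Flatten the feature map and apply one linear layer."""
    return weights @ features.ravel() + bias

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((28, 28))                 # stand-in for a grayscale digit (assumption)
kernel = rng.standard_normal((5, 5))         # one untrained filter (assumption)
feature_map = np.maximum(conv2d(image, kernel), 0.0)  # convolution + ReLU -> 24x24
pooled = max_pool(feature_map)                         # pooling -> 12x12
weights = rng.standard_normal((10, pooled.size))       # 10 hypothetical classes
probs = softmax(fully_connected(pooled, weights, np.zeros(10)))
predicted_class = int(np.argmax(probs))                # index of the most likely class
```

With trained weights, `predicted_class` would be the recognized label; here it only shows how the data flows from image to class scores.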

Standard CNN Model

[Figure: Input → Convolution layers and pooling layers → Fully connected layers → Output]

What you did yesterday...

[Figure: Ground Truth Label (Human) + Machine → Recognition Output: “9” “5” “2” “7”]

Standard CNN Model: Example

• Object Datasets → Object Recognition: Input → Output “Dog”

• Face Datasets → Face Recognition: Input → Output “柯P”

• Any Dataset + Label → Any Recognition

Jonathan Long, Evan Shelhamer, Trevor Darrell

(from UC Berkeley)

Image-to-Image Model

[Figure: Input → Convolution layers → Fully connected layers → Output]

[Figure: Input → Convolution layers → Up-sampling → Output]
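The image-to-image idea can be sketched in a few lines of NumPy: the per-image fully connected layer is swapped for an operation applied identically at every pixel (a 1x1 convolution), and the coarse result is up-sampled back to image size. This is only a sketch under assumptions: the channel counts, the nearest-neighbour up-sampling, and the random weights are illustrative choices, not the design of any particular paper.

```python
import numpy as np

def conv1x1(features, weights):
    """Apply the same linear map at every pixel.
    features: (C_in, H, W), weights: (C_out, C_in) -> (C_out, H, W)."""
    return np.einsum("oc,chw->ohw", weights, features)

def upsample_nearest(features, factor):
    """Nearest-neighbour up-sampling along the two spatial axes."""
    return features.repeat(factor, axis=1).repeat(factor, axis=2)

rng = np.random.default_rng(0)
weights = rng.standard_normal((3, 8))    # 8 feature channels -> 3 output channels (assumption)

# The same weights work for any spatial size because no layer fixes H or W.
for h, w in [(4, 4), (6, 10)]:
    coarse = rng.random((8, h, w))       # stand-in for features from convolution layers
    output = upsample_nearest(conv1x1(coarse, weights), factor=2)
    print(coarse.shape, "->", output.shape)
# (8, 4, 4) -> (3, 8, 8)
# (8, 6, 10) -> (3, 12, 20)
```

Because nothing in the pipeline depends on a fixed height or width, the output is an image whose size follows the input, which is what distinguishes this model from the standard classifier.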

Applications

Super-resolution

- Input: Low-resolution image Y

- Output: High-resolution image F(Y)

Image Credit: Wei-Sheng Lai @ UC Merced
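The input/output contract can be illustrated with a toy NumPy example. Here `F_baseline` is a hypothetical placeholder for F that simply repeats pixels; a trained network would take its place. The block-average `downsample`, the 32x32 image, and the mean-squared-error measure are assumptions made for the sketch.

```python
import numpy as np

def downsample(x, factor):
    """Create a low-resolution image Y by averaging factor x factor blocks."""
    h, w = x.shape[0] // factor, x.shape[1] // factor
    return x[:h * factor, :w * factor].reshape(h, factor, w, factor).mean(axis=(1, 3))

def F_baseline(y, factor):
    """Placeholder for F: nearest-neighbour up-sampling, with no learning involved."""
    return y.repeat(factor, axis=0).repeat(factor, axis=1)

def mse(a, b):
    """Mean squared error between two images of the same shape."""
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
X = rng.random((32, 32))     # stand-in high-resolution ground truth
Y = downsample(X, 2)         # 16x16 low-resolution input Y
FY = F_baseline(Y, 2)        # 32x32 estimate F(Y)
error = mse(FY, X)           # a learned F would aim to make this smaller
```

The point of the sketch is the shapes: Y is smaller than the target, F(Y) matches the target size, and the quality of F is judged against the high-resolution ground truth.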

[1] “Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution”, Lai et al., CVPR ‘17

[2] “Image super-resolution using deep convolutional networks”, Dong et al., TPAMI ‘17

[3] “Accelerating the super-resolution convolutional neural network”, Dong et al., ECCV ‘16

[4] “Accurate image super-resolution using very deep convolutional network”, Kim et al., CVPR ‘16

[5] “Deeply-recursive convolutional network for image super-resolution”, Kim et al., CVPR ‘16

[6] “Learning a Deep Convolutional Network for Image Super-Resolution”, Dong et al., ECCV ‘14

Reconstructing Compressed Image

• Kulkarni et al. [7] present a non-iterative and extremely fast algorithm to reconstruct images from compressively sensed (CS) random measurements

[7] “ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Random Measurements”, Kulkarni et al., CVPR ‘16
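For readers unfamiliar with compressive sensing, the sketch below shows what "random measurements" of an image block look like: a random matrix maps n pixels to m < n numbers. The reconstruction shown is a plain pseudo-inverse, used here only as a naive non-iterative baseline; it is not the ReconNet network. The 8x8 block size, m = 16, and the Gaussian matrix are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
block = rng.random((8, 8))                  # one image block (size is an assumption)
x = block.ravel()                           # n = 64 pixel values
m = 16                                      # number of measurements, m < n
Phi = rng.standard_normal((m, x.size)) / np.sqrt(m)   # random measurement matrix
y = Phi @ x                                 # compressively sensed measurements

# Naive baseline: minimum-norm solution that is consistent with the measurements.
x_hat = np.linalg.pinv(Phi) @ y
block_hat = x_hat.reshape(block.shape)
reconstruction_error = float(np.mean((block_hat - block) ** 2))
```

Since many different blocks produce the same 16 measurements, the baseline generally does not recover the original; a learned model replaces this step with a mapping trained on natural images.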


Image Artifacts Removal

• Compression artifacts can be removed using ARCNN [8]

[8] “Compression Artifacts Reduction by a Deep Convolutional Network”, Dong et al., ICCV ‘15


Context Inpainting

• Pathak et al. [9] generate the contents of an arbitrary image region conditioned on its surroundings using a CNN

[9] “Context Encoders: Feature Learning by Inpainting”, Pathak et al., CVPR ‘16.

Context Inpainting

• Pathak et al. [9] generate the contents of an arbitrary image region conditioned on its surroundings using a Generative Adversarial Network
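The inpainting setup can be sketched as a masking problem: a region is blanked out, a model predicts it from the surroundings, and the prediction is pasted back into the hole. In the NumPy sketch below the "model" is a hypothetical stand-in that predicts the mean intensity, and the centred square hole and the masked squared-error measure are assumptions, not details taken from [9].

```python
import numpy as np

def make_center_mask(height, width, hole):
    """Binary mask: 1 inside the square hole to fill, 0 for the known context."""
    mask = np.zeros((height, width))
    top, left = (height - hole) // 2, (width - hole) // 2
    mask[top:top + hole, left:left + hole] = 1.0
    return mask

def masked_l2(prediction, target, mask):
    """Mean squared error measured only over the missing region."""
    return float(np.sum(mask * (prediction - target) ** 2) / np.sum(mask))

rng = np.random.default_rng(0)
image = rng.random((16, 16))
mask = make_center_mask(16, 16, hole=8)
network_input = image * (1.0 - mask)              # surroundings kept, region blanked out
prediction = np.full_like(image, image.mean())    # stand-in for the network output
completed = mask * prediction + (1.0 - mask) * image   # paste the prediction into the hole
loss = masked_l2(prediction, image, mask)
```

Measuring the error only inside the mask reflects that the surrounding pixels are given, so only the generated region needs to be judged.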


Face Inpainting

• Input: Corrupted facial image

• Output: Complete facial image

[10] “Generative Face Completion”, Li et al., CVPR ‘17.

[11] “DeMeshNet: Blind Face Inpainting for Deep MeshFace Verification”, Zhang et al., CVPR ‘17

Face Rotation

• Input: Facial image

• Output: Facial image at a given angle

[12] “Rotating Your Face Using Multi-task Deep Neural Network”, Yim et al., CVPR ‘15.

[13] “Disentangled Representation Learning GAN for Pose-Invariant Face Recognition”, Tran et al., CVPR ‘17

Attribute Manipulation

[14] “StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation”, Choi et al., CVPR ‘18

Colorization

• Cheng et al. [15] investigate the colorization problem, which converts a grayscale image to a colorful one

[15] “Deep Colorization”, Cheng et al., ICCV ‘15.
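One common way to frame colorization as an image-to-image problem is to work in a luminance/chrominance color space: the grayscale image is the luminance channel, and a model only has to predict the two chrominance channels. The NumPy sketch below uses the BT.601 YUV transform as an assumed choice of color space; it shows the data preparation and recombination only, with no model, and is not claimed to be the exact pipeline of [15].

```python
import numpy as np

# BT.601 RGB -> YUV matrix; rows give Y (luminance), U and V (chrominance).
RGB_TO_YUV = np.array([[0.299, 0.587, 0.114],
                       [-0.14713, -0.28886, 0.436],
                       [0.615, -0.51499, -0.10001]])
YUV_TO_RGB = np.linalg.inv(RGB_TO_YUV)

def split_luma_chroma(rgb):
    """rgb: (H, W, 3) -> luminance (H, W, 1) and chrominance (H, W, 2)."""
    yuv = rgb @ RGB_TO_YUV.T
    return yuv[..., :1], yuv[..., 1:]

def merge_luma_chroma(luma, chroma):
    """Recombine a luminance map with (predicted) chrominance into RGB."""
    yuv = np.concatenate([luma, chroma], axis=-1)
    return yuv @ YUV_TO_RGB.T

rng = np.random.default_rng(0)
color_image = rng.random((4, 4, 3))
gray_input, chroma_target = split_luma_chroma(color_image)  # model input and training target
restored = merge_luma_chroma(gray_input, chroma_target)     # perfect prediction -> original image
```

In training, `gray_input` would be fed to the network and `chroma_target` would serve as the ground truth for its output.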


Colorization

• Iizuka et al. [16] propose a technique to automatically colorize grayscale images that combines both global priors and local image features.

– Global image priors are extracted from the entire image

– Local image features are computed from small image patches

[16] “Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification”, Iizuka et al., SIGGRAPH ‘16
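Combining global priors with local features can be sketched as a simple fusion step: one global vector describing the whole image is copied to every spatial position and concatenated with the local feature map. The NumPy sketch below shows only this tensor manipulation; the channel sizes are made up, and the learned layers that would follow the fusion in a real network are omitted.

```python
import numpy as np

def fuse_global_local(local_features, global_features):
    """Concatenate a global vector onto every spatial position of a local map.
    local_features: (C_local, H, W), global_features: (C_global,)
    returns: (C_local + C_global, H, W)."""
    _, h, w = local_features.shape
    tiled = np.broadcast_to(global_features[:, None, None],
                            (global_features.size, h, w))
    return np.concatenate([local_features, tiled], axis=0)

rng = np.random.default_rng(0)
local_features = rng.random((6, 5, 7))   # per-position features (sizes are assumptions)
global_features = rng.random(4)          # one vector summarizing the entire image
fused = fuse_global_local(local_features, global_features)   # shape (10, 5, 7)
```

After fusion, every pixel position carries both its own local description and the same image-level summary, so later layers can use scene-level cues when choosing colors.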


Sketch Simplification

• Simo-Serra et al. [17] propose a CNN structure to simplify sketch drawings

• This architecture can process images of any resolution because it is a fully convolutional neural network

[Figure: Input Image, Output Image]

[17] “Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup”, Simo-Serra et al., SIGGRAPH ‘16

Sketch Simplification

• The input is more challenging: a rough raster image rather than a vector image

Sketch-to-Photo Inversion

[18] “Scribbler: Controlling Deep Image Synthesis with Sketch and Color”, Sangkloy et al., CVPR ‘17

Attribute Manipulation + Style Transfer

[19] “Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation”, Liu et al., CVPR ‘18

No Label Supervision


Artistic Image Style Transfer

• Gatys et al. [20] propose a system which uses neural representations to separate and recombine the content and style of arbitrary images

[20] “A Neural Algorithm of Artistic Style”, Gatys et al., CVPR ‘16

• Based on the VGG-19 framework

• Extract the feature maps of a single photo and an artwork to generate an image which mixes their content and style

[Figure: VGG-19, Content / Style Representation]
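The separation of content and style can be illustrated with two quantities computed from a feature map: the features themselves (content) and the Gram matrix of channel correlations (style). The NumPy sketch below uses random arrays as stand-ins for VGG feature maps, and the normalization constants are assumptions; it is a toy illustration rather than the full optimization procedure.

```python
import numpy as np

def gram_matrix(features):
    """features: (C, H, W) -> (C, C) matrix of channel co-activations."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def content_loss(generated, content):
    """Compares feature maps position by position."""
    return float(np.mean((generated - content) ** 2))

def style_loss(generated, style):
    """Compares channel statistics, ignoring where activations occur."""
    return float(np.mean((gram_matrix(generated) - gram_matrix(style)) ** 2))

rng = np.random.default_rng(0)
features = rng.random((4, 5, 5))              # stand-in for one layer's feature map
perm = rng.permutation(25)
shuffled = features.reshape(4, 25)[:, perm].reshape(4, 5, 5)  # same values, new positions

same_style = style_loss(shuffled, features)         # close to 0: Gram matrix ignores layout
different_content = content_loss(shuffled, features)  # > 0: the spatial layout changed
```

Shuffling pixel positions leaves the Gram matrix essentially unchanged while changing the content term, which is the intuition behind treating the two representations separately.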


• Feed-forward CNN model (1000x faster than Gatys et al.)

• The loss network is based on a pre-trained VGG-16 framework

[21] “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, Johnson et al., ECCV ‘16.

Artistic Video Style Transfer

https://www.youtube.com/watch?v=Khuj4ASldmU

[22] “Artistic style transfer for videos”, Ruder et al., arXiv ‘16.

Artistic 360 Video Style Transfer

https://www.youtube.com/watch?v=pkgMUfNeUCQ

[23] “Artistic style transfer for videos and spherical images”, Ruder et al., arXiv ‘17.

Deep Photo Enhancer

[24] “Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs”, Chen et al., CVPR ‘18

Single Image Depth Estimation

• Depth Estimation CNN

– Input: RGB image

– Output: Depth/Disparity estimation

[25] “Unsupervised Monocular Depth Estimation with Left-Right Consistency”, Godard et al., CVPR ‘17

[26] “Unsupervised Learning of Depth and Ego-motion from Video”, Zhou et al., CVPR ‘17

[27] “Semi-Supervised Deep Learning for Monocular Depth Map Prediction”, Kuznietsov et al., CVPR ‘17
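Depth and disparity are two views of the same quantity for a rectified stereo camera pair: depth = focal length × baseline / disparity. The short sketch below applies that standard relation to a predicted disparity map; the focal length and baseline values are made-up assumptions, and the `eps` guard against zero disparity is an implementation choice for the example.

```python
import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m, eps=1e-6):
    """Convert disparity (pixels) to depth (metres) for a rectified stereo pair."""
    return focal_length_px * baseline_m / np.maximum(disparity, eps)

disparity_map = np.array([[10.0, 20.0],
                          [40.0, 80.0]])       # stand-in for a network's output
depth_map = disparity_to_depth(disparity_map,
                               focal_length_px=720.0,   # hypothetical camera
                               baseline_m=0.5)          # hypothetical stereo baseline
# Larger disparity means a closer object: depth_map is [[36, 18], [9, 4.5]] metres.
```

This is why a network that outputs disparity can still be described as a depth estimator: given the camera parameters, one is converted to the other pixel by pixel.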


Semantic Segmentation

• Semantic Segmentation CNN (Pixel-wise classification)

– Input: RGB image

– Output: Pixel-wise classes prediction

[28] “Fully Convolutional Networks for Semantic Segmentation”, Long et al., CVPR ‘15

[29] “Pyramid Scene Parsing Network”, Zhao et al., CVPR ‘17

https://www.youtube.com/watch?v=qWl9idsCuLQ
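Pixel-wise classification means the network outputs one score per class at every pixel, and the label map is obtained by picking the highest-scoring class at each position. The NumPy sketch below shows that final step with hand-written scores; the three classes and the pixel-accuracy measure are assumptions added for illustration.

```python
import numpy as np

def predict_labels(scores):
    """scores: (num_classes, H, W) -> (H, W) map of class indices."""
    return np.argmax(scores, axis=0)

def pixel_accuracy(predicted, truth):
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float(np.mean(predicted == truth))

# Scores for 3 hypothetical classes on a 2x2 image.
scores = np.array([[[0.9, 0.1], [0.2, 0.3]],    # class 0
                   [[0.05, 0.8], [0.1, 0.3]],   # class 1
                   [[0.05, 0.1], [0.7, 0.4]]])  # class 2
labels = predict_labels(scores)                 # [[0, 1], [2, 2]]
truth = np.array([[0, 1], [2, 0]])
accuracy = pixel_accuracy(labels, truth)        # 0.75
```

The output has the same height and width as the input image, with a class index in place of each pixel.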

Today’s Practice!

Style Transfer + Semantic Segmentation

• Champandard [30] introduces a novel concept to augment artistic style algorithms with semantic annotations

[Figure labels: Painting, Doodle by Human, Result, by CNN or Human]

[30] “Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks”, Champandard, arXiv, Mar. 2016

Summary

Today’s Presentation

• Style Transfer

• Semantic Segmentation

• Depth Estimation

• Photo Enhancer

• Attribute Manipulation

• Super-Resolution

• Colorization

• Image Artifacts Removal

• Context Inpainting


Conclusion

• Research areas including computer vision, image processing, and computer graphics have achieved great success with deep learning

• Convolutional neural networks are still evolving and continue to achieve remarkable performance

• Unsupervised learning

• Meta learning (learning to learn)

• Explanation of neural networks
