Deep Learning on Computer Vision Applications

Yen-Cheng Liu, Research Assistant
Dept. of EE, National Taiwan University

Source: media.ee.ntu.edu.tw/crash_course/2018/dl/dlcv_app.pdf


Outline

• Convolutional Neural Network (CNN)
  • Standard Model
  • Image-to-Image Model
• Applications
  • Super-resolution
  • Compressed Artifact Reduction
  • Reconstruct Compressed Image
  • Context Inpainting
  • Colorization
  • Sketch Simplification
  • Style Transfer
  • Depth Estimation
  • Semantic Segmentation


No Math in this tutorial


How I feel when no math appears in a paper

Convolutional Neural Network

CNN History

• In the 1990s, CNNs were widely used, but they then fell out of fashion, particularly in computer vision, with the rise of support vector machines (SVMs).

• In 2012, CNNs became popular again after their significant success on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC).

Standard structure

• Convolution layers and pooling layers, followed by fully connected layers

Convolutional Neural Network

Components

• Convolution layers, pooling layers, and fully connected layers

• Purpose: originally for classification (e.g. LeNet)

[Figure: a convolution layer (input feature map, filters, output feature map), a pooling layer, and a fully-connected layer. Image credit: Stanford CS231n]
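The three components can be sketched in a few lines. This is a minimal illustration in plain Python lists (so it runs without any dependency), not how a deep learning framework implements them; the toy image, the filter, and the weights are made-up values.

```python
def conv2d(image, kernel):
    """Stride-1 convolution without padding (computed as cross-correlation,
    which is what CNN convolution layers do in practice)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def fully_connected(features, weights, biases):
    """One score per row of `weights`: score_k = w_k . features + b_k."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


# Toy 5x5 input feature map and one 2x2 filter (illustrative values only).
image = [[r * 5 + c for c in range(5)] for r in range(5)]
kernel = [[1, 0], [0, -1]]

feature_map = conv2d(image, kernel)        # 4x4 output feature map
pooled = max_pool2d(feature_map)           # 2x2 after pooling
flat = [v for row in pooled for v in row]  # flatten before the FC layer
scores = fully_connected(flat, [[1, 1, 1, 1], [0, 0, 0, 1]], [0, 0])
```

Note how the spatial size shrinks at each stage (5x5, then 4x4, then 2x2) until only a short vector of class scores is left.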

Standard CNN Model

[Figure: Input → convolution layers and pooling layers → fully connected layers → Output]

What you did yesterday...

[Figure: images + ground-truth labels, with human and machine panels]

[Figure: digit recognition. Input digit images → convolution layers and pooling layers → fully connected layers → Output: “9”, “5”, “2”, “7”]
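For a recognition task, the last fully connected layer produces one score per class, and the predicted label is the class with the highest score. The sketch below assumes a softmax on top of the scores, which is the usual choice but is not stated on the slide; the score values are made up.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores, labels):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


labels = [str(d) for d in range(10)]         # digit classes "0" to "9"
scores = [0.1, 0.0, 0.3, 0.2, 0.1, 0.4, 0.0, 0.9, 0.2, 3.5]  # made-up FC outputs
label, prob = predict(scores, labels)        # picks the digit "9"
```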

Standard CNN Model: Example

Object datasets → CNN → object recognition

[Figure: Input image → convolution layers and pooling layers → fully connected layers → Output: “Dog”]

Standard CNN Model: Example

Face datasets → CNN → face recognition

[Figure: Input face image → convolution layers and pooling layers → fully connected layers → Output: “柯P”]

Standard CNN Model: Example

X datasets + labels → CNN → X recognition

Jonathan Long, Evan Shelhamer, Trevor Darrell (from UC Berkeley)


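Image-to-Image Model

[Figure: the standard model (Input → convolution layers → fully connected layers → Output) compared with the image-to-image model (Input image → convolution layers → up-sampling → Output image)]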
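The up-sampling step is what lets the output be an image again rather than a label. A minimal sketch, assuming nearest-neighbour up-sampling (learned up-sampling layers are also common); `downsample` is a hypothetical stand-in for the resolution loss caused by pooling or strided convolution.

```python
def downsample(feature_map, factor=2):
    """Keep every `factor`-th value along both axes."""
    return [row[::factor] for row in feature_map[::factor]]


def upsample_nearest(feature_map, factor=2):
    """Nearest-neighbour up-sampling: repeat each value `factor` times along both axes."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

low = downsample(image)            # 2x2: spatial detail is lost here
restored = upsample_nearest(low)   # 4x4 again: same size as the input image
```

The restored map has the input's size but not its detail, which is why image-to-image networks learn their filters instead of relying on fixed interpolation alone.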


Applications

Super-resolution

• Input: low-resolution image Y

• Output: high-resolution image F(Y)

[Figure: super-resolution results for methods [1] to [5]. Image credit: Wei-Sheng Lai @ UC Merced]

[1] “Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution”, Lai et al., CVPR ’17

[2] “Image super-resolution using deep convolutional networks”, Dong et al., TPAMI ’17

[3] “Accelerating the super-resolution convolutional neural network”, Dong et al., ECCV ’16

[4] “Accurate image super-resolution using very deep convolutional network”, Kim et al., CVPR ’16

[5] “Deeply-recursive convolutional network for image super-resolution”, Kim et al., CVPR ’16
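To judge how good F(Y) is, it has to be compared with the true high-resolution image. The slide does not name a metric; the sketch below assumes peak signal-to-noise ratio (PSNR), which is commonly reported for super-resolution, and uses made-up pixel values.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    diffs = [(r - e) ** 2
             for ref_row, est_row in zip(reference, estimate)
             for r, e in zip(ref_row, est_row)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return math.inf                 # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


ground_truth = [[10, 20], [30, 40]]     # hypothetical high-resolution image
f_of_y = [[12, 18], [33, 40]]           # hypothetical network output F(Y)
score = psnr(ground_truth, f_of_y)      # higher means closer to the ground truth
```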

Super-resolution

[Figure from [6]]

[6] “Learning a Deep Convolutional Network for Image Super-Resolution”, Dong et al., ECCV ’14

Reconstructing Compressed Image

• Kulkarni et al. [7] present a non-iterative and extremely fast algorithm to reconstruct images from compressively sensed (CS) random measurements

[7] “ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Random Measurements”, Kulkarni et al., CVPR ’16
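"Compressively sensed random measurements" means the camera records a few random linear combinations of the pixels instead of the pixels themselves. A minimal sketch of that measurement step, assuming a Gaussian random matrix and a tiny 4x4 block (both are illustrative choices, not the settings of [7]); the reconstruction network would then learn the mapping from `y` back to the block.

```python
import random


def random_measurement_matrix(m, n, seed=0):
    """m x n matrix with Gaussian entries; m < n means fewer measurements than pixels."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) / m ** 0.5 for _ in range(n)] for _ in range(m)]


def measure(phi, x):
    """Compressive measurement y = phi * x for a flattened image block x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


block = [float(v) for v in range(16)]        # flattened 4x4 image block
phi = random_measurement_matrix(m=4, n=16)   # 4 measurements for 16 pixels
y = measure(phi, block)                      # what the reconstruction starts from
```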


Image Artifacts Removal

• Compression artifacts can be removed using AR-CNN [8]

[8] “Compression Artifacts Reduction by a Deep Convolutional Network”, Dong et al., ICCV ’15


Context Inpainting

• Pathak et al. [9] generate the contents of an arbitrary image region conditioned on its surroundings, using a CNN and a Generative Adversarial Network (GAN)

[9] “Context Encoders: Feature Learning by Inpainting”, Pathak et al., CVPR ’16
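Training data for inpainting can be made by hiding a region of a complete image and asking the network to fill it in. The sketch below shows only the masking and a reconstruction error restricted to the hidden region; it is an illustrative simplification, and the adversarial (GAN) part of the training is not shown.

```python
def mask_region(image, top, left, size, fill=0):
    """Return a copy of `image` with a size x size square replaced by `fill`."""
    out = [list(row) for row in image]
    for i in range(top, top + size):
        for j in range(left, left + size):
            out[i][j] = fill
    return out


def masked_l2(prediction, target, top, left, size):
    """Mean squared error computed only over the hidden square."""
    total = 0.0
    for i in range(top, top + size):
        for j in range(left, left + size):
            total += (prediction[i][j] - target[i][j]) ** 2
    return total / (size * size)


image = [[r * 4 + c for c in range(4)] for r in range(4)]   # toy 4x4 image
corrupted = mask_region(image, top=1, left=1, size=2)       # network input
loss_unfilled = masked_l2(corrupted, image, 1, 1, 2)        # error if nothing is filled in
loss_perfect = masked_l2(image, image, 1, 1, 2)             # error of a perfect fill
```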


Face Inpainting

• Input: corrupted facial image

• Output: complete facial image

[Figure: input/output pairs from [10] and [11]]

[10] “Generative Face Completion”, Li et al., CVPR ’17

[11] “DeMeshNet: Blind Face Inpainting for Deep MeshFace Verification”, Zhang et al., CVPR ’17

Face Rotation

• Input: facial image

• Output: facial image at a given angle

[12] “Rotating Your Face Using Multi-task Deep Neural Network”, Yim et al., CVPR ’15

[13] “Disentangled Representation Learning GAN for Pose-Invariant Face Recognition”, Tran et al., CVPR ’17

Attribute Manipulation

[14] “StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation”

Colorization

• Cheng et al. [15] investigate the colorization problem, which converts a grayscale image into a color one

[15] “Deep Colorization”, Cheng et al., ICCV ’15


Colorization

• Iizuka et al. [16] propose a technique to automatically colorize grayscale images that combines both global priors and local image features.

  – Global image priors are extracted from the entire image

  – Local image features are computed from small image patches

[16] “Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification”, Iizuka et al., SIGGRAPH ’16
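One simple way to combine a global prior with local features is to append the same global vector to the local feature vector at every image location. The sketch below illustrates that idea only; the function name `fuse` and all feature values are hypothetical, and the full network in [16] has further layers around this step.

```python
def fuse(local_features, global_vector):
    """local_features: H x W grid of per-location feature lists.
    Returns an H x W grid where every location also carries the global vector."""
    return [[list(cell) + list(global_vector) for cell in row]
            for row in local_features]


# 2x2 grid of local features (2 values each) and one global descriptor (3 values).
local = [[[0.1, 0.2], [0.3, 0.4]],
         [[0.5, 0.6], [0.7, 0.8]]]
global_vec = [9.0, 8.0, 7.0]

fused = fuse(local, global_vec)   # every location now has 2 + 3 = 5 values
```

After fusion, a later layer can decide the colour of a small patch using both what the patch looks like and what kind of scene the whole image shows.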


Sketch Simplification

• Simo-Serra et al. [17] propose a CNN architecture to simplify sketch drawings

• The architecture can process images of any resolution because it is a fully convolutional neural network

[Figure: input image and output image]

[17] “Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup”, Simo-Serra et al., SIGGRAPH ’16
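A convolution only slides a small filter over the image, so the same filter works on any image size, whereas a fully connected layer expects a fixed number of inputs. A minimal sketch with one zero-padded 3x3 averaging filter applied to two different resolutions (the filter and the images are illustrative, not taken from [17]):

```python
def conv2d_same(image, kernel):
    """Stride-1 convolution with zero padding, so the output has the input's size.
    Assumes an odd kernel size."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:   # outside pixels count as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


blur = [[1 / 9] * 3 for _ in range(3)]       # one 3x3 filter
small = [[1.0] * 4 for _ in range(3)]        # 3x4 image
large = [[1.0] * 7 for _ in range(5)]        # 5x7 image

out_small = conv2d_same(small, blur)         # still 3x4
out_large = conv2d_same(large, blur)         # still 5x7, same filter reused
```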

Sketch Simplification

• Simo-Serra et al. [17] take a more challenging input: a rough raster image instead of a vector image

Sketch-to-Photo Inversion

[Figure: input sketches and output photos]

[18] “Scribbler: Controlling Deep Image Synthesis with Sketch and Color”, Sangkloy et al., CVPR ’17

Attribute Manipulation + Style Transfer

• No label supervision

[Figure: input and output images]

[19] “Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation”, CVPR ’18


Artistic Image Style Transfer

• Gatys et al. [20] propose a system that uses neural representations to separate and recombine the content and style of arbitrary images

[20] “A Neural Algorithm of Artistic Style”, Gatys et al., CVPR ’16

Artistic Image Style Transfer

• Based on the VGG-19 network

• Extracts the feature maps of a single photo and an artwork to generate an image that mixes the photo's content with the artwork's style

[Figure: VGG-19 content and style representations]

[20] “A Neural Algorithm of Artistic Style”, Gatys et al., CVPR ’16
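In [20], content is compared directly on the feature maps, while style is compared on the Gram matrix of the feature maps, that is, on how strongly each pair of channels fires together regardless of where. The sketch below uses tiny made-up feature maps instead of real VGG-19 activations, and the second set is the first one with its spatial positions reversed.

```python
def gram_matrix(features):
    """features: C channels, each a flattened list of H*W activations.
    Returns the C x C matrix of channel-to-channel inner products."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def mse(a, b):
    """Mean squared difference between two equally shaped 2D lists."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


feats = [[1.0, 2.0, 3.0, 4.0],        # channel 0 at 4 spatial positions
         [0.0, 1.0, 0.0, 1.0]]        # channel 1 at the same positions
rearranged = [[4.0, 3.0, 2.0, 1.0],   # same activations, positions reversed
              [1.0, 0.0, 1.0, 0.0]]

content_loss = mse(feats, rearranged)                          # layout changed
style_loss = mse(gram_matrix(feats), gram_matrix(rearranged))  # statistics unchanged
```

The content loss is non-zero because the layout moved, but the style loss is zero: the Gram matrix ignores where things are, which is why it captures texture and style rather than scene layout.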


Artistic Image Style Transfer

• Feed-forward CNN model (about 1000x faster than Gatys et al.)

• The loss network is based on a pre-trained VGG-16 network

[21] “Perceptual Losses for Real-Time Style Transfer and Super-Resolution”, Johnson et al., ECCV ’16

Artistic Video Style Transfer

https://www.youtube.com/watch?v=Khuj4ASldmU

[22] “Artistic style transfer for videos”, Ruder et al., arXiv ’16

Artistic 360 Video Style Transfer

https://www.youtube.com/watch?v=pkgMUfNeUCQ

[23] “Artistic style transfer for videos and spherical images”, Ruder et al., arXiv ’17

Deep Photo Enhancer

[24] “Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs”, Chen et al., CVPR ’18


Single Image Depth Estimation

• Depth Estimation CNN

  – Input: RGB image

  – Output: depth/disparity estimation

[25] “Unsupervised Monocular Depth Estimation with Left-Right Consistency”, Godard et al., CVPR ’17

[26] “Unsupervised Learning of Depth and Ego-motion from Video”, Zhou et al., CVPR ’17

[27] “Semi-Supervised Deep Learning for Monocular Depth Map Prediction”, Kuznietsov et al., CVPR ’17
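Depth and disparity carry the same information: for a rectified stereo camera pair, depth = focal length × baseline / disparity, so a network can predict either one. A minimal sketch of the conversion; the focal length, baseline, and disparity values are made up for illustration.

```python
def disparity_to_depth(disparity_map, focal_length_px, baseline_m):
    """Depth in metres from disparity in pixels: Z = f * B / d.
    Zero disparity corresponds to a point infinitely far away."""
    return [
        [focal_length_px * baseline_m / d if d > 0 else float("inf") for d in row]
        for row in disparity_map
    ]


disparity = [[64.0, 32.0],
             [16.0, 0.0]]            # hypothetical network output, in pixels
depth = disparity_to_depth(disparity, focal_length_px=720.0, baseline_m=0.5)
```

Large disparities map to nearby points and small disparities to distant ones.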


Semantic Segmentation

• Semantic Segmentation CNN (pixel-wise classification)

  – Input: RGB image

  – Output: pixel-wise class prediction

[28] “Fully Convolutional Networks for Semantic Segmentation”, Long et al., CVPR ’15

[29] “Pyramid Scene Parsing Network”, Zhao et al., CVPR ’17
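Pixel-wise classification means the network outputs one score map per class, and each pixel takes the class with the highest score at that position. A minimal sketch with three hypothetical classes on a 2x3 image; the scores and the ground truth are made up.

```python
def segment(score_maps):
    """score_maps[k][i][j] is the score of class k at pixel (i, j).
    Returns an H x W map holding the best class index for every pixel."""
    num_classes = len(score_maps)
    h, w = len(score_maps[0]), len(score_maps[0][0])
    return [
        [max(range(num_classes), key=lambda k: score_maps[k][i][j])
         for j in range(w)]
        for i in range(h)
    ]


def pixel_accuracy(predicted, truth):
    """Fraction of pixels whose predicted class matches the ground truth."""
    pairs = [(p, t) for pr, tr in zip(predicted, truth) for p, t in zip(pr, tr)]
    return sum(p == t for p, t in pairs) / len(pairs)


score_maps = [
    [[0.90, 0.10, 0.20], [0.80, 0.30, 0.10]],   # class 0
    [[0.05, 0.80, 0.70], [0.10, 0.60, 0.20]],   # class 1
    [[0.05, 0.10, 0.10], [0.10, 0.10, 0.70]],   # class 2
]
truth = [[0, 1, 1], [0, 0, 2]]

labels = segment(score_maps)                    # one class index per pixel
accuracy = pixel_accuracy(labels, truth)        # 5 of the 6 pixels are correct
```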

Semantic Segmentation

https://www.youtube.com/watch?v=qWl9idsCuLQ

Today’s Practice!


Style Transfer + Semantic Segmentation

• Champandard [30] introduces a novel concept: augmenting the artistic style algorithm with semantic annotations

[Figure: doodle by a human and the resulting image]

[30] “Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks”, Champandard et al., arXiv, Mar. 2016

Style Transfer + Semantic Segmentation

[Figure from [30]: a painting together with its semantic segmentation, produced by a CNN or a human]


Summary

Today’s Presentation

• Style Transfer
• Semantic Segmentation
• Depth Estimation
• Photo Enhancer
• Attribute Manipulation
• Super-Resolution
• Colorization
• Image Artifacts Removal
• Context Inpainting

Summary

[Figure: “Computer Vision”, with “Today’s Presentation” marked within it]

Conclusion

• Research areas including computer vision, image processing, and computer graphics have achieved great success with deep learning

• Convolutional neural networks are still evolving and continue to achieve remarkable performance

  – Unsupervised learning

  – Meta learning (learning to learn)

  – Explanation of neural networks