Image-to-Image Translation with Conditional Adversarial Nets (Pix2Pix) & Perceptual Adversarial Networks for Image-to-Image Transformation (PAN) 2017/10/2 DLHacks Otsubo

[DLHacks Implementation] Perceptual Adversarial Networks for Image-to-Image Transformation



Page 1: [DLHacks Implementation] Perceptual Adversarial Networks for Image-to-Image Transformation

Image-to-Image Translation with Conditional Adversarial Nets (Pix2Pix)
& Perceptual Adversarial Networks for Image-to-Image Transformation (PAN)

2017/10/2 DLHacks Otsubo

Page 2

Topic: image-to-image “translation”

Page 3

Info

Pix2Pix [CVPR 2017]
•  Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
•  University of California, Berkeley
•  178 citations
•  Related work from the same group:
   -  iGAN [ECCV 2016]
   -  interactive-deep-colorization [SIGGRAPH 2017]
   -  Context Encoders [CVPR 2016]
   -  Image Quilting [SIGGRAPH 2001]
   -  Texture Synthesis by Non-parametric Sampling [ICCV 1999]

PAN [arXiv 2017]
•  Chaoyue Wang, Chang Xu, Chaohui Wang, Dacheng Tao
•  University of Technology Sydney, The University of Sydney, Université Paris-Est

Page 4

Background

•  Many tasks can be regarded as a “translation” from an input image to an output image
   -  Diverse task-specific methods exist for them

Is there a single framework to achieve all of them?

Page 5

Overview

Pix2Pix
•  A general-purpose solution to image-to-image translation using a single framework
   -  The single framework: conditional GAN (cGAN)

PAN
•  PAN = Pix2Pix − (per-pixel loss) + (perceptual adversarial loss)

Page 6

Naive Implementation: U-Net (①)

① per-pixel loss (L1/L2)
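As a concrete reference, the per-pixel loss ① can be sketched in plain Python over flattened pixel lists (the actual implementations operate on PyTorch tensors; `l1_loss` and `l2_loss` are illustrative names, not the repo's API):

```python
def l1_loss(pred, target):
    """Per-pixel L1 loss: mean absolute error over all pixels."""
    assert len(pred) == len(target)
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def l2_loss(pred, target):
    """Per-pixel L2 loss: mean squared error over all pixels."""
    assert len(pred) == len(target)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
```

Pix2Pix uses L1 rather than L2 because L1 produces less blurry outputs.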

Page 7

Pix2Pix (①+②)

② adversarial loss

Page 8

Pix2Pix’s loss (①+②)

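For reference, the objective from the Pix2Pix paper (① + ②) is:

```latex
\mathcal{L}_{cGAN}(G,D) = \mathbb{E}_{x,y}[\log D(x,y)]
                        + \mathbb{E}_{x,z}[\log(1 - D(x, G(x,z)))]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x,z) \rVert_1\big]

G^* = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G,D) + \lambda \, \mathcal{L}_{L1}(G)
```

with λ = 100 in the paper's experiments.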

Page 9

PAN (②+③)

③ perceptual adversarial loss

Page 10

PAN’s loss (②+③)

(the perceptual term is an L1 norm over the discriminator’s hidden-layer features; m is a positive margin constant)
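A minimal sketch of the perceptual adversarial loss ③ in plain Python. Assumptions: `feats_real`/`feats_fake` stand for the discriminator's hidden-layer activations (flattened to lists, one per layer), `lambdas` are the per-layer weights, and `m` is the positive margin; all names are illustrative, not the paper's code:

```python
def l1(a, b):
    """Mean absolute difference between two flat feature lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def perceptual_loss(feats_real, feats_fake, lambdas):
    """③: weighted sum of L1 distances between the discriminator's
    hidden-layer features of the real and generated images."""
    return sum(lam * l1(fr, ff)
               for lam, fr, ff in zip(lambdas, feats_real, feats_fake))

def generator_perceptual_term(feats_real, feats_fake, lambdas):
    # G minimizes the perceptual discrepancy directly.
    return perceptual_loss(feats_real, feats_fake, lambdas)

def discriminator_perceptual_term(feats_real, feats_fake, lambdas, m):
    # D maximizes the discrepancy, but only up to the margin m:
    # the hinge term [m - L_P]^+ stops D once the gap exceeds m.
    return max(0.0, m - perceptual_loss(feats_real, feats_fake, lambdas))
```

Because the feature extractor is the (adversarially trained) discriminator itself, the measured discrepancy keeps adapting, unlike a fixed pre-trained VGG.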

Page 11

Example 1: Image De-Raining

•  Removing rain from single images via a deep detail network [Fu, CVPR 2017]
•  ID-GAN (cGAN) [Zhang, arXiv 2017]
   -  per-pixel loss
   -  adversarial loss
   -  perceptual loss from a pre-trained VGG

(Figure: Input / Output / Ground Truth)

Page 12

Example 1: Image De-Raining (cont.)

(cf. PAN uses discriminator’s perceptual loss)

Page 13

Example 2: Image Inpainting

•  Globally and Locally Consistent Image Completion [Iizuka, SIGGRAPH 2017]
•  Context Encoders (cGAN) [Pathak, CVPR 2016]
   -  per-pixel loss
   -  adversarial loss

(Figure: Input / Output / Ground Truth)

Page 14

Example 3: Semantic Segmentation

Cityscapes / Pascal VOC
•  DeepLabv3 [Chen, arXiv 2017]
•  PSPNet [Zhao, CVPR 2017]
http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?cls=mean&challengeid=11&compid=6

Cell Tracking / CREMI
•  Learned Watershed [Wolf, ICCV 2017]
•  U-Net [Ronneberger, MICCAI 2015]
http://www.codesolorzano.com/Challenges/CTC/Welcome.html

(Figure: Input / Output / Ground Truth)

Page 15

Result 1: Image De-Raining

(Figure: de-raining comparison; some settings are annotated “≒ pix2pix”)

Page 16

Result 2: Image Inpainting

Page 17

Result 3: Semantic Segmentation

Page 18

Discussion

Why is the perceptual adversarial loss so effective?

vs. no perceptual loss (Pix2Pix)
-  The perceptual loss enables D to detect more discrepancies between real and generated images

vs. pre-trained VGG perceptual loss (ID-GAN)
-  VGG features tend to focus on content
-  PAN features tend to focus on discrepancy
-  PAN’s loss may help avoid adversarial examples [Goodfellow, ICLR 2015] (?)

Page 19

Minor Difference

•  Pix2Pix uses PatchGAN
   -  A small (70×70) patch discriminator
   -  The final output of D is the average of the patch discriminator’s responses (applied convolutionally)
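The 70×70 figure can be verified with a small receptive-field calculation. This sketch assumes the standard Pix2Pix discriminator layout (four 4×4 convolutions with strides 2, 2, 2, 1, plus a 4×4 stride-1 output convolution):

```python
def receptive_field(layers):
    """Walk backward from one output unit; each (kernel, stride) layer
    grows the field as rf = rf * stride + (kernel - stride)."""
    rf = 1
    for kernel, stride in reversed(layers):
        rf = rf * stride + (kernel - stride)
    return rf

# C64, C128, C256 (stride 2), C512 (stride 1), 1-channel output (stride 1)
patch_d = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(receptive_field(patch_d))  # → 70
```

Each unit of D's final map therefore judges one 70×70 input patch, and the patch responses are averaged as described above.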

Page 20

To Do

•  Implement:
   1.  Pix2Pix (patch discriminator)
   2.  PAN (patch discriminator)
   3.  PAN (normal discriminator)
•  Wang et al. might have compared 1 with 3.

Page 21

Page 22

Implementation

2017/10/17 DLHacks Otsubo

Page 23

My Implementation

•  https://github.com/DLHacks/pix2pix_PAN
•  pix2pix: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
•  PAN
   -  per-pixel loss → perceptual adversarial loss
   -  not the same as the paper’s original architecture
   -  the number of parameters is the same as in pix2pix

Page 24

My Experiments

•  Facade (label → picture)
•  Map (picture → Google map)
•  Cityscape (picture → label)

Page 25

Result (Facade pix2pix)


Page 26

Result (Facade PAN)


Page 27

Result (Map pix2pix)


Page 28

Result (Map PAN)


Page 29

Result (Cityscape pix2pix)


Page 30

Result (Cityscape PAN)


Page 31

Result (PSNR [dB])

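For reference, the PSNR metric reported here can be computed as follows (a sketch for 8-bit images; `pred` and `target` are flat lists of pixel values in [0, 255]):

```python
import math

def psnr(pred, target, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to ground truth."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Note that PSNR rewards low per-pixel error, so it can favor the per-pixel-loss-trained pix2pix even when PAN's outputs look perceptually sharper.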

Page 32

Discussion: Why pix2pix > PAN?

•  Is a per-pixel loss needed after all?
•  Is the patch discriminator not suited to PAN?
•  The choice of the positive margin m?
•  (a bad pix2pix implementation in PAN’s paper…?)