
Institute of Electrical Measurement and Measurement Signal Processing


Axel Pinz WS 2017/18 Image and Video Understanding 4

2D Scene Representation and Description

You can get very far in 2D !

2D “image object”

“token”

“tokenset”

2D scene description

[Diagram: image, image description, segmentation, 2D grouping]


Segmentation: From Images to Tokens

• “Formal” definition of segmentation: image I is segmented → segmentation S = {regions R_i | rules 1-4}

1. $\bigcup_i R_i = I$

2. $\forall i \neq j:\; R_i \cap R_j = \emptyset$

3. some homogeneity criterion H holds: $\forall i:\; H(R_i) = \text{true}$

4. disjunct neighbors: $\text{neighbor}(R_i, R_j):\; H(R_i \cup R_j) = \text{false}$
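The four rules can be checked mechanically for a candidate segmentation. The sketch below is only an illustration: regions are assumed to be sets of (row, col) pixels, neighbors are assumed to be 4-connected, and the homogeneity criterion H (grey-value range below a threshold) is a hypothetical choice, not one prescribed by the slides.

```python
from itertools import combinations

def are_neighbors(a, b):
    """True if some pixel of a is 4-adjacent to a pixel of b (assumed neighborhood)."""
    return any((r + dr, c + dc) in b
               for r, c in a
               for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))

def max_range_criterion(threshold):
    """Hypothetical H: the grey-value range inside a region is at most threshold."""
    def homogeneous(image, region):
        values = [image[r][c] for r, c in region]
        return max(values) - min(values) <= threshold
    return homogeneous

def is_valid_segmentation(image, regions, homogeneous):
    """Check rules 1-4 for regions given as sets of (row, col) pixels."""
    all_pixels = {(r, c) for r in range(len(image)) for c in range(len(image[0]))}
    # Rule 1: the regions cover the whole image.
    if set().union(*regions) != all_pixels:
        return False
    # Rule 2: the regions are pairwise disjoint.
    if any(a & b for a, b in combinations(regions, 2)):
        return False
    # Rule 3: every region satisfies H.
    if not all(homogeneous(image, region) for region in regions):
        return False
    # Rule 4: merging two neighboring regions must violate H.
    for a, b in combinations(regions, 2):
        if are_neighbors(a, b) and homogeneous(image, a | b):
            return False
    return True

image = [[10, 11, 50],
         [10, 12, 52]]
left = {(0, 0), (0, 1), (1, 0), (1, 1)}
right = {(0, 2), (1, 2)}
H = max_range_criterion(5)

valid = is_valid_segmentation(image, [left, right], H)
# Splitting the left block again breaks rule 4: its two halves could be merged.
over_segmented = is_valid_segmentation(
    image, [{(0, 0), (1, 0)}, {(0, 1), (1, 1)}, right], H)
```

Rule 4 is what rules out over-segmentation here: both halves of the left block are homogeneous, but so is their union.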

• Fundamental concepts of segmentation
  • Region-based segmentation (e.g. threshold, split+merge, region growing)
  • Contour-based segmentation (e.g. edge detection → closing of gaps)

• State-of-the-art segmentation algorithms
  • Graph-based segmentation
  • Level set


Segmenting 2D “Shapes”

Region-Based Segmentation (1)

e.g.: Global threshold
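A global threshold is probably the simplest region-based segmentation. As a minimal sketch, assuming a grey-value image stored as a list of rows and a hand-picked threshold t (both are illustrative assumptions):

```python
def global_threshold(image, t):
    """Label each pixel 1 (foreground) if its grey value exceeds t, else 0."""
    return [[1 if value > t else 0 for value in row] for row in image]

image = [[12, 15, 200],
         [10, 180, 210],
         [11, 13, 14]]
mask = global_threshold(image, 100)
```

The same t is applied to every pixel, so the result depends strongly on how well the grey values of object and background separate.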


Region-Based Segmentation (2)

e.g.: Region growing from seed cells
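Region growing can be sketched as a breadth-first search that starts at a seed and absorbs similar neighbors. The acceptance rule used here (grey value within a fixed tolerance of the seed value, 4-connectivity) is an assumption made for illustration; other homogeneity criteria work the same way.

```python
from collections import deque

def region_grow(image, seed, tolerance):
    """Grow a 4-connected region from seed; accept pixels close to the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

image = [[10, 11, 90],
         [12, 10, 95],
         [91, 11, 92]]
region = region_grow(image, seed=(0, 0), tolerance=5)
```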


Region-Based Segmentation (3)

e.g.: Split and Merge



Edge-Based Segmentation (1)

General approach:

1) Smoothing

2) Edge detection

3) Filtering
   a) Eliminate short edges
   b) Close small gaps

4) Obtain closed contours

[Figure: image → edge image]


Edge-Based Segmentation (2)

[Figure: real edge profile and real line profile]

Edge: inflection point → locate it at the zero crossing of the 2nd derivative

Line: local extremum → locate it at the zero crossing of the 1st derivative

Noise → smoothing needed → convolution with a Gaussian!


Edge-Based Segmentation (3)

D. Marr + E. Hildreth: LoG / DoG zero crossings [1978]

“Mexican hat” / “Sombrero” Operator

http://laurent-duval.blogspot.co.at/2014/09/cours-radial-basis-functions.html

$\Delta (G * I) = (\Delta G) * I$
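The identity says that smoothing followed by the Laplacian equals a single convolution with the Laplacian of the Gaussian. A 1D sketch can make this concrete: it checks the identity numerically on a step edge and then locates the edge as a zero crossing. The kernel size, sigma and the discrete second-derivative mask [1, -2, 1] are illustrative assumptions.

```python
import math

def gaussian_kernel(sigma, radius):
    """Sampled, normalized 1D Gaussian G."""
    values = [math.exp(-(x * x) / (2 * sigma * sigma))
              for x in range(-radius, radius + 1)]
    total = sum(values)
    return [v / total for v in values]

def convolve(signal, kernel):
    """Full discrete convolution (output length len(signal) + len(kernel) - 1)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

LAPLACE_1D = [1.0, -2.0, 1.0]          # discrete second derivative

signal = [0.0] * 20 + [1.0] * 20       # ideal step edge between samples 19 and 20
radius = 6
G = gaussian_kernel(sigma=2.0, radius=radius)

smooth_then_derive = convolve(convolve(signal, G), LAPLACE_1D)   # Delta(G * I)
log_kernel = convolve(G, LAPLACE_1D)                             # Delta G
response = convolve(signal, log_kernel)                          # (Delta G) * I

max_difference = max(abs(a - b) for a, b in zip(smooth_then_derive, response))

# Zero crossings from positive to negative mark rising edges.
crossings = [i for i in range(len(response) - 1)
             if response[i] > 1e-9 and response[i + 1] < -1e-9]
# Full convolution shifts indices by radius + 1 relative to the input samples.
edge_positions = [i - (radius + 1) for i in crossings]
```

Because the signal is finite, the full convolution also sees a falling edge at its end; that one produces a negative-to-positive crossing and is deliberately not collected here.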


Edge-Based Segmentation (4)

LoG / DoG zero crossings always produce closed contours!

[Figure: original images; results for two thresholds T; Canny edge detector]


Relaxing the formal definition: Superpixels

Compactness and/or size parameters control the segmentation

Rule 4 (disjunct neighbors) may be violated:

$\text{neighbor}(R_i, R_j):\; H(R_i \cup R_j)$ need not be false … it may be true!

Example: [Achanta et al., SLIC superpixels, PAMI 34(11):2274-2282, 2012]
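The role of the compactness parameter can be illustrated with the combined distance used by SLIC, D = sqrt(d_c² + (d_s / S)² m²), where d_c is the colour distance, d_s the spatial distance, S the grid interval and m the compactness. The sketch below is not the full algorithm (no local search window, no centre update); the tuple layout (l, a, b, x, y) and all numeric values are assumptions for illustration.

```python
import math

def slic_distance(pixel, center, grid_interval, compactness):
    """Combined colour/spatial distance; pixel and center are (l, a, b, x, y)."""
    d_color = math.dist(pixel[:3], center[:3])
    d_space = math.dist(pixel[3:], center[3:])
    return math.sqrt(d_color ** 2
                     + (d_space / grid_interval) ** 2 * compactness ** 2)

def nearest_center(pixel, centers, grid_interval, compactness):
    """Index of the cluster centre with the smallest combined distance."""
    return min(range(len(centers)),
               key=lambda k: slic_distance(pixel, centers[k],
                                           grid_interval, compactness))

pixel = (50.0, 0.0, 0.0, 4.0, 0.0)
centers = [(50.0, 0.0, 0.0, 10.0, 0.0),   # same colour, 6 pixels away
           (60.0, 0.0, 0.0, 5.0, 0.0)]    # different colour, 1 pixel away

colour_driven = nearest_center(pixel, centers, grid_interval=10.0, compactness=1.0)
compact = nearest_center(pixel, centers, grid_interval=10.0, compactness=40.0)
```

With a small m the pixel joins the centre of the same colour; with a large m spatial proximity wins, which yields more compact superpixels.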


SLIC Examples [Achanta et al., PAMI 2012]

Spatial segmentations into “superpixels”


SLIC Examples [Achanta et al., PAMI 2012]

Spatio-temporal segmentations into “supervoxels”


Temporal Superpixels (TSPs) [Chang et al., CVPR’13]


High-Level Segmentation – The “Correct” one?

[Figure: segmentations with 3 regions, 24 regions, and 41 regions]

http://www.cs.berkeley.edu/projects/vision/grouping/segbench/


State-of-the-Art Segmentation (1)

Graph Cut [Shi+Malik, PAMI 2000]

J. Shi, J. Malik, “Normalized Cuts and Image Segmentation”, PAMI 22(8):888-905, 2000

Fully connected graph (all pixels of the image !)

Weights w_pq measure the similarity between pixels p and q


Graph cut (2)

J. Shi, J. Malik, “Normalized Cuts and Image Segmentation”, PAMI 22(8):888-905, 2000

Partition the graph into subgraphs

Goal: high similarity within a subgraph, low similarity between subgraphs
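The goal above is captured by the normalized cut criterion of Shi and Malik, Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V). The sketch below only evaluates this criterion on a tiny fully connected graph and finds the best bipartition by brute force; the original method instead relaxes the problem to a generalized eigenvalue problem. The intensity-only similarity w_pq and the value of sigma are assumptions for illustration.

```python
import math
from itertools import combinations

def similarity_matrix(intensities, sigma):
    """Hypothetical weights w_pq = exp(-(I_p - I_q)^2 / sigma^2), no self-loops."""
    n = len(intensities)
    return [[0.0 if p == q
             else math.exp(-((intensities[p] - intensities[q]) ** 2) / sigma ** 2)
             for q in range(n)] for p in range(n)]

def ncut(weights, group_a):
    """Normalized cut value of the bipartition (group_a, rest)."""
    n = len(weights)
    a = set(group_a)
    b = set(range(n)) - a
    cut = sum(weights[p][q] for p in a for q in b)
    assoc_a = sum(weights[p][q] for p in a for q in range(n))
    assoc_b = sum(weights[p][q] for p in b for q in range(n))
    return cut / assoc_a + cut / assoc_b

intensities = [10, 12, 11, 80, 82]          # two obvious groups of "pixels"
W = similarity_matrix(intensities, sigma=10.0)

# Enumerate every bipartition once by always keeping node 0 in group A.
n = len(intensities)
candidates = [set((0,) + rest)
              for size in range(0, n - 1)
              for rest in combinations(range(1, n), size)]
best = min(candidates, key=lambda a: ncut(W, a))
```

Brute force is exponential in the number of pixels, which is why it is only usable on toy graphs like this one.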


N-Cut Results (1)


N-Cut Results (2)


State-of-the-Art Segmentation (2)

Graph Based [Felzenszwalb+Huttenlocher, IJCV 2004]

P. Felzenszwalb, D. Huttenlocher, “Efficient, Graph-based Image Segmentation”, IJCV 59(2), 2004

Pairwise comparison of regions

[Figure: original image; segmentation using “grid graph”; segmentation using “nearest neighbor graph”]
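The pairwise region comparison can be sketched with a union-find structure: edges are visited in order of increasing weight, and two components are merged when the connecting edge is not heavier than the internal variation of either component plus a size-dependent tolerance k / |C|. The 1D "image", the grid-graph edges and the value of k below are illustrative assumptions.

```python
class DisjointSet:
    """Union-find that also tracks size and internal variation per component."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.internal = [0.0] * n   # largest edge weight merged into the component

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b, weight):
        if self.size[a] < self.size[b]:
            a, b = b, a
        self.parent[b] = a
        self.size[a] += self.size[b]
        # Edges arrive sorted, so this weight is the largest one in the component.
        self.internal[a] = weight

def segment_graph(num_nodes, edges, k):
    """edges: list of (weight, p, q). Returns one component label per node."""
    ds = DisjointSet(num_nodes)
    for weight, p, q in sorted(edges):
        a, b = ds.find(p), ds.find(q)
        if a == b:
            continue
        threshold = min(ds.internal[a] + k / ds.size[a],
                        ds.internal[b] + k / ds.size[b])
        if weight <= threshold:
            ds.union(a, b, weight)
    return [ds.find(i) for i in range(num_nodes)]

values = [10, 11, 12, 50, 51, 52]                       # toy 1D "image"
edges = [(abs(values[i] - values[i + 1]), i, i + 1)     # grid graph: neighbors only
         for i in range(len(values) - 1)]
labels = segment_graph(len(values), edges, k=5)
```

A larger k makes merging easier and therefore favors larger regions.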


Graph Based (2) [Felzenszwalb+Huttenlocher, IJCV 2004]



State-of-the-Art Segmentation (3)

Level-Set Based Segmentation

James Sethian (1999): Level Set & Fast Marching Methods, Cambridge.

Stan Osher & Ronald Fedkiw (2002): Level Set Methods and Dynamic Implicit Surfaces, Springer.

Stan Osher & Nikos Paragios (2003): Geometric Level Set in Imaging, Vision+Graphics, Springer.

Split the image into 2 regions, such that:

• Similarity between pixels of a region is maximized

• Contour length is minimized

e.g.: [Chan&Vese, 2001]

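$$E(\Phi, u_+, u_-) = \int_\Omega \Big[ (f - u_+)^2\, H(\Phi) + (f - u_-)^2\, \big(1 - H(\Phi)\big) \Big]\, dx \;+\; \nu \int_\Omega |\nabla H(\Phi)|\, dx$$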


Level-Set Based Segmentation (2)

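$$E(\Phi, u_+, u_-) = \int_\Omega \Big[ (f - u_+)^2\, H(\Phi) + (f - u_-)^2\, \big(1 - H(\Phi)\big) \Big]\, dx \;+\; \nu \int_\Omega |\nabla H(\Phi)|\, dx$$

u+, u-: average intensities of fore- and background

Φ: level-set function

Ω: image domain

f: image point (pixel)

H: Heaviside step function

ν: weight of the regularisation term (contour)

{Φ = 0}: contour

Minimize E w.r.t. Φ, u+, u- (and hence the two regions Φ>0 and Φ<0)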

https://en.wikipedia.org/wiki/Heaviside_step_function
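To see how the two terms of the energy trade off, it helps to evaluate a discrete version for two candidate level-set functions. This is a sketch under several assumptions: a sharp Heaviside function (H = 1 where Φ > 0), u+ and u- set to the region means, and the contour length approximated by counting sign changes of Φ between 4-neighbors. It evaluates E only and does not perform the minimization.

```python
def chan_vese_energy(f, phi, nu):
    """Discrete energy for image f and level-set function phi (lists of rows).

    Returns (energy, u_plus, u_minus).
    """
    rows, cols = len(f), len(f[0])
    inside = [f[r][c] for r in range(rows) for c in range(cols) if phi[r][c] > 0]
    outside = [f[r][c] for r in range(rows) for c in range(cols) if phi[r][c] <= 0]
    u_plus = sum(inside) / len(inside) if inside else 0.0
    u_minus = sum(outside) / len(outside) if outside else 0.0
    # Data term: squared deviation from the mean of the region a pixel belongs to.
    data_term = (sum((v - u_plus) ** 2 for v in inside)
                 + sum((v - u_minus) ** 2 for v in outside))
    # Regularisation term: sign changes of phi between 4-neighbors (assumed
    # approximation of the contour length).
    length = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and (phi[r][c] > 0) != (phi[r][c + 1] > 0):
                length += 1
            if r + 1 < rows and (phi[r][c] > 0) != (phi[r + 1][c] > 0):
                length += 1
    return data_term + nu * length, u_plus, u_minus

f = [[0, 0, 0, 0],
     [0, 100, 100, 0],
     [0, 100, 100, 0],
     [0, 0, 0, 0]]
phi_good = [[-1, -1, -1, -1],
            [-1, 1, 1, -1],
            [-1, 1, 1, -1],
            [-1, -1, -1, -1]]          # contour around the bright block
phi_bad = [[1, 1, -1, -1] for _ in range(4)]   # contour splits the image in half

e_good, u_plus, u_minus = chan_vese_energy(f, phi_good, nu=10)
e_bad, _, _ = chan_vese_energy(f, phi_bad, nu=10)
```

The half-split has the shorter contour, but its data term is so large that the contour around the block has the much lower energy.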


Level-Set Based Segmentation (3)

[Figure: three examples, each showing the regions Φ>0 and Φ<0]

Many adaptations, various formulations

e.g. [Paragios&Deriche], [Brox&Weickert], [Fussenegger]


Level-Set Based Segmentation (4)

Multi-region Level-Set


Tokenset: Relaxing the formal definition

1. $\bigcup_i R_i = I$

2. $\forall i \neq j:\; R_i \cap R_j = \emptyset$

3. some homogeneity criterion H holds: $\forall i:\; H(R_i) = \text{true}$

4. disjunct neighbors: $\text{neighbor}(R_i, R_j):\; H(R_i \cup R_j) = \text{false}$

• Tokens may be post-processed (e.g. opening/closing)
  • # of holes, # of parts → several regions!

• Tokens may be overlapping (at least: their bounding boxes)

• No need to cover the whole image!


Token – Image Object

• Points (x,y)

• Lines ((x1,y1),(x2,y2))

• Polylines/Chains ((x1,y1),(x2,y2), …,(xn,yn))

• Polygons ((x1,y1),(x2,y2), …,(xn,yn),(x1,y1))
  Squares, rectangles, circles, ellipses, … (parametrized closed contours)

• Constellations/Bitmaps

“Feret” box: bounding box, aligned with the (x,y) axes

[Figure legend: foreground pixels, background pixels]


Constellation Tokens


Tokenset

A file, i.e., a list of tokens → database, various indexing

[Figure: tokenset file with token type, lexicon, and data]


Tokenset → 2D Image/Scene Description

“houses” [Matsuyama’90] “face” [Brunelli’92] “pedestrians” [Suzuki’90]


Example: PhD Pinz (1988)

Finding trees in aerial images …

Original image → Smoothed (conv. Gaussian) → Local brightness maxima → Circles

circles → trees

Spruce (Fichte) + Pine (Kiefer)


Example: PhD Pinz (1988)

Finding trees in aerial images …

[Diagram: the example mapped onto the processing chain: 2D image, image proc., image, segmentation, image description, 2D grouping, 2D scene description]


2D Scene Representation and Description

You can get very far in 2D !

2D “image object”

“token”

“tokenset”

2D scene description


Some very fundamental questions:

• What characterizes an object ?

“objectness”

- compact, (self-)similar, distinct (color, texture), …

• Given an object, what characterizes its shape ?

• Maybe easier to answer: What is not shape?

- color

- texture

- size

?


“Objectness”

[Alexe et al., PAMI 2012]

cf. “Region Proposal Network – RPN” as one component in ConvNets for Object Detection (e.g., Faster R-CNN, NIPS’15, see https://arxiv.org/abs/1506.01497)


Object Shape

Humans (again!) are very good at characterizing shape!

Proportions!

[Figure: Pablo Picasso, “Rites of Spring”, from D. Marr, VISION, fig. 3-56 (a)]

2D 3D


Ideas open to further development

• Spatio-temporal shape (2D space + time)

• Motion patterns, trajectory space, …

• Spatio-temporal shape (3D space + time)


Token Features for 2D Grouping

Shape

• Minimum bounding rectangle (MBR)

• Best ellipse fit

• Aspect ratio (AR): |log(height/width)| = |log(width/height)|

• BR fill: % of foreground pixels in Feret box or in MBR

• Circumference

• Compactness = $\dfrac{\text{area}}{\text{circumference}^2}$

• Elongatedness = $1 - \dfrac{\text{minor axis}}{\text{major axis}}$ of the best ellipse fit

Appearance

• Color

• Texture

• Histograms of … local descriptors
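Several of the shape features listed above can be computed directly from a bitmap token. The sketch below assumes a token given as a set of (x, y) foreground pixels, uses the axis-aligned Feret box (not the MBR), and measures the circumference as the number of pixel edges that face the background; these representation choices are assumptions, and the best ellipse fit is left out.

```python
import math

def token_features(pixels):
    """Shape features of a token given as a set of (x, y) foreground pixels."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    area = len(pixels)
    # Circumference: pixel edges whose 4-neighbor is background (assumed measure).
    circumference = sum((x + dx, y + dy) not in pixels
                        for x, y in pixels
                        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    return {
        "feret_box": (min(xs), min(ys), max(xs), max(ys)),
        "aspect_ratio": abs(math.log(height / width)),
        "br_fill": area / (width * height),
        "circumference": circumference,
        "compactness": area / circumference ** 2,
    }

wide = {(x, y) for x in range(4) for y in range(2)}    # 4 x 2 rectangle
tall = {(y, x) for x, y in wide}                       # same token rotated by 90 degrees
l_shape = wide - {(2, 1), (3, 1)}                      # rectangle with a corner removed

wide_features = token_features(wide)
tall_features = token_features(tall)
l_features = token_features(l_shape)
```

The absolute value of the logarithm is what makes the aspect ratio identical for the wide and the tall rectangle.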


Homographies, Collineations, Perspective Transformations

vs.

Token shape, appearance, etc.

I borrow from „Bildgestützte Messverfahren“

(image-based measurement 2VO, 1LU) …

A hierarchy of transformations – hierarchy of geometries:

• Euclidean

• Similarity transformation

• Affine transformation

• Perspective transformation

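$$\mathbf{x}' = H\,\mathbf{x}: \qquad \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$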
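Applying x' = H x to an image point means lifting the point to homogeneous coordinates (x, y, 1), multiplying by H, and dividing by the third component. A minimal sketch, with example matrices that are made up for illustration:

```python
def apply_homography(H, point):
    """Map a 2D point through the 3x3 matrix H using homogeneous coordinates."""
    x, y = point
    x1 = H[0][0] * x + H[0][1] * y + H[0][2]
    x2 = H[1][0] * x + H[1][1] * y + H[1][2]
    x3 = H[2][0] * x + H[2][1] * y + H[2][2]
    return (x1 / x3, x2 / x3)

# Similarity-like example: scale by 2, translate by (1, 3).
H_sim = [[2.0, 0.0, 1.0],
         [0.0, 2.0, 3.0],
         [0.0, 0.0, 1.0]]
# Projective example: the last row is no longer (0, 0, 1).
H_proj = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.5, 0.0, 1.0]]

p_sim = apply_homography(H_sim, (1.0, 1.0))
p_proj = apply_homography(H_proj, (2.0, 2.0))

# H is only defined up to scale: 3 * H maps every point to the same place.
H_scaled = [[3.0 * h for h in row] for row in H_sim]
p_scaled = apply_homography(H_scaled, (1.0, 1.0))
```

The scale ambiguity is the reason a 3x3 homography has 8 rather than 9 degrees of freedom.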


A Hierarchy of Transformations / Geometries

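Euclidean (E), 3 dof: $\begin{pmatrix} r_{11} & r_{12} & t_x \\ r_{21} & r_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$

Similarity (S), 4 dof: $\begin{pmatrix} s\,r_{11} & s\,r_{12} & t_x \\ s\,r_{21} & s\,r_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$

Affine (A), 6 dof: $\begin{pmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$

Projective (P), 8 dof: $\begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}$

In 2D, a square transforms to: [Figure: the transformed square for each of the four geometries]

$$\mathbf{x}' = H\,\mathbf{x}: \qquad \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$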

Invariance of token features?


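Invariance of Token Features

[Table: token features (rows) versus transformations (columns); the cell entries are not preserved]

Columns: Radiometric transformation, E (Euclidean), S (Similarity), A (Affine), P (Projective)

Rows: Feret box: AR; Feret box: BR fill; MBR: AR; MBR: fill; best ellipse; elongatedness; circumference; compactness; size (# pix.); color; texture; # holes; # parts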


Examples 2D Grouping

Large tokens:

Original image → 434 constellation tokens → 9 tokens > 2000 pixels

Parallelism:

Original image → 26002 straight lines → 94 lines with orientation -4


Grouping – Example (KU/09)

edges → circles (Hough) → coins


Perceptual Grouping (Lowe’87)

[Figure: input image; projection of 3D wireframe model; successful matches by perceptual grouping]


The Umass VISIONS System

Interpretations of Massachusetts “road” scenes [Draper et al., IJCV 1989]

[Figure: original image; interpretation result; interpretation key]


The Umass VISIONS System

Interpretations of Massachusetts “house” scenes [Draper et al., IJCV 1989]

[Figure: original image; interpretation result; interpretation key]


Bottom-Up vs. Top-Down Grouping

2D Models

• Chains of “edgels”, “ridgels”

• Hough transform: complex patterns → local maxima; image space → “Hough” space, bins

• Active contour models

• Shape priors

• Active shape models
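The Hough transform item in the list above can be sketched for straight lines: every image point votes for all (θ, ρ) bins of lines passing through it, with ρ = x cos θ + y sin θ, and a line in image space shows up as a local maximum in Hough space. The bin sizes (1 degree, 1 pixel) and the test points are assumptions for illustration.

```python
import math
from collections import Counter

def hough_lines(points, theta_steps=180, rho_step=1.0):
    """Vote in (theta index, rho bin) cells using rho = x cos(theta) + y sin(theta)."""
    accumulator = Counter()
    for x, y in points:
        for t in range(theta_steps):
            theta = math.pi * t / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[(t, round(rho / rho_step))] += 1
    return accumulator

# 50 points on the horizontal line y = 5, plus two outliers.
points = [(x, 5) for x in range(50)] + [(3, 1), (7, 8)]
accumulator = hough_lines(points)
(best_theta_index, best_rho_bin), votes = accumulator.most_common(1)[0]
```

Here the maximum lies at θ = 90 degrees and ρ = 5, the normal form of the line y = 5; the outliers vote elsewhere and do not disturb it.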


Bottom-Up vs. Top-Down Grouping

2D Models

Edges → chains

Lines → ridges

Lines → valleys


“Snakes” – Active Contour Models

Kass, Witkin, Terzopoulos, 1st ICCV, London, 1987

Image energies (greylevel, gradient, …), mechanical model (spring)

Active contours → adaptation to subjective contours
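The interplay of image energy and the mechanical (spring) model can be illustrated with a discrete closed contour. In the sketch below the internal energy sums a stretching term (first differences, weight alpha) and a bending term (second differences, weight beta), and the image energy is the negative edge strength sampled at the contour points. The edge-strength function and all weights are hypothetical, and only the energy is evaluated, not its minimization.

```python
def internal_energy(contour, alpha, beta):
    """Stretching plus bending energy of a closed contour of (x, y) points."""
    n = len(contour)
    energy = 0.0
    for i in range(n):
        px, py = contour[i - 1]
        cx, cy = contour[i]
        nx, ny = contour[(i + 1) % n]
        stretch = (cx - px) ** 2 + (cy - py) ** 2
        bend = (px - 2 * cx + nx) ** 2 + (py - 2 * cy + ny) ** 2
        energy += 0.5 * (alpha * stretch + beta * bend)
    return energy

def snake_energy(contour, edge_strength, alpha, beta):
    """Internal energy plus image energy (strong edges lower the energy)."""
    image_energy = -sum(edge_strength(x, y) for x, y in contour)
    return internal_energy(contour, alpha, beta) + image_energy

def edge_strength(x, y):
    """Hypothetical image: strong edges on the boundary of a 2 x 2 square."""
    return 10.0 if max(abs(x - 1), abs(y - 1)) == 1 else 0.0

on_edges = [(0, 0), (2, 0), (2, 2), (0, 2)]    # contour sitting on the object boundary
dented = [(0, 0), (2, 0), (1, 1), (0, 2)]      # one point pulled inside the object

e_internal_on_edges = internal_energy(on_edges, alpha=1.0, beta=1.0)
e_internal_dented = internal_energy(dented, alpha=1.0, beta=1.0)
e_total_on_edges = snake_energy(on_edges, edge_strength, alpha=1.0, beta=1.0)
e_total_dented = snake_energy(dented, edge_strength, alpha=1.0, beta=1.0)
```

The internal energy alone prefers the shorter, dented contour; only the image energy makes the contour on the object boundary the better one.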


“Snakes” – Active Contour Models

Kass, Witkin, Terzopoulos, 1st ICCV, London, 1987

Tracking of moving contours

Extension to 3D: “Balloons”


“Shape Priors” in Level Set Segmentation [Fussenegger]

… global deformation (scale, rot, translation)


“Active Shape Models” in Level Set Segmentation [Fussenegger]

• Shape changes dependent on viewpoint → train ASM

• Videos: level_set_hide+seek, level_set_teapot


“Active Shape Models” in Level Set Segmentation [Fussenegger]

3 ASMs learnt: elephant, octopus, african man

[Figure: (a) original image; (b) segmentation; (c), (d) varying the order of the 3 ASMs]


Summary IVU_1 – IVU_4

• Vision
  • Neurophysiology
  • Cognitive psychology
  • Computational theory (Marr paradigm, representations, algorithms)

• Linear Filtering, Convolution

• Definition of terms, system model of image understanding
  • Visual recognition → the “holy grail” of computer vision
  • Segmentation and grouping → 2D image/scene description

… recap


Definition: Visual Recognition [Perona’09]

“The holy grail of Computer Vision”

Five tasks of “visual recognition”:

– Verification (is a “car” in the image?)

– Detection and localization (what is there? where?)

– Classification (n “beach” images, m “city” images)

– Naming (name and locate all objects in an image)

– Description: objects, actions, relations, etc.

(example “kissing” “scene understanding”)

Increasing complexity from top to bottom

Image and Video Understanding: mostly 2D (+time) recognition

Image-based Measurement: 3D (+time) reconstruction


My Model of Image Understanding


2D Scene Representation and Description

You can get very far in 2D !

2D “image object”

“token”

“tokenset”

2D scene description


What next?


Course Schedule 2016/17

• Vision

• Neurophysiology

• Cognitive psychology

• Computational theory (Marr paradigm, representations, algorithms)

• Linear Filtering, Convolution

• Definition of terms, system model of image understanding

• Visual recognition the “holy grail” of computer vision

• Segmentation and grouping 2D image/scene description

• Object categorization

• Terms, goals, issues, …

• Signal processing: Fourier, Gabor

• Scale

• Object models

• CNNs for image and video understanding
