Institute of Electrical Measurement and Measurement Signal Processing

Axel Pinz WS 2017/18 Image and Video Understanding 4

2D Scene Representation and Description

You can get very far in 2D!

[Diagram: image → segmentation → image description (2D “image objects”: “tokens”, “tokenset”) → 2D grouping → 2D scene description]


Segmentation: From Images to Tokens

• “Formal” definition of segmentation: image I is segmented → segmentation S = {regions R_i | rules 1-4}

1. $\bigcup_i R_i = I$

2. $i \neq j: R_i \cap R_j = \emptyset$

3. some homogeneity criterion H holds: $\forall i: H(R_i) = \mathrm{true}$

4. distinct neighbors: $\mathrm{neighbor}(R_i, R_j): H(R_i \cup R_j) = \mathrm{false}$
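The four rules can be checked mechanically for a candidate segmentation. The sketch below is only an illustration under stated assumptions: the segmentation is given as an integer label image (so rules 1 and 2 hold by construction), neighbors are defined by the 4-neighborhood, and `homogeneous` is a hypothetical user-supplied predicate standing in for H. It does not check that each region is connected.

```python
import numpy as np

def is_valid_segmentation(image, labels, homogeneous):
    """Check rules 3 and 4 for a label image (rules 1 and 2 hold by construction)."""
    image = np.asarray(image)
    labels = np.asarray(labels)
    ids = np.unique(labels)

    # Rule 3: every region must satisfy the homogeneity criterion H.
    if not all(homogeneous(image[labels == i]) for i in ids):
        return False

    # Collect pairs of labels that touch horizontally or vertically.
    pairs = set()
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        diff = a != b
        pairs.update(zip(a[diff].tolist(), b[diff].tolist()))

    # Rule 4: merging two neighboring regions must violate H.
    for i, j in pairs:
        if homogeneous(image[(labels == i) | (labels == j)]):
            return False
    return True
```

With a predicate such as "intensity range within a region is at most 10", an over-segmented label image fails rule 4 and an under-segmented one fails rule 3.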

• Fundamental concepts of segmentation

  • Region-based segmentation (e.g. threshold, split+merge, region growing)

  • Contour-based segmentation (e.g. edge detection → closing of gaps)

• State-of-the-art segmentation algorithms

  • Graph based segmentation

  • Level set

[Diagram: image → segmentation → image description → 2D grouping]


Segmenting 2D “Shapes”

Region-Based Segmentation (1)

e.g.: Global threshold
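A global threshold assigns every pixel to foreground or background by comparing it with a single value T. The slide does not say how T is chosen; the sketch below assumes a simple iterative mean-of-means rule, and the function names `global_threshold` and `iterative_threshold` are hypothetical.

```python
import numpy as np

def global_threshold(image, t):
    """Foreground mask: True where the pixel value exceeds the threshold t."""
    return np.asarray(image) > t

def iterative_threshold(image, eps=0.5, max_iter=100):
    """Pick T as the midpoint of the two class means, iterated until stable.

    This selection rule is an assumption for illustration, not the slide's method.
    """
    image = np.asarray(image, float)
    t = image.mean()
    for _ in range(max_iter):
        fg, bg = image[image > t], image[image <= t]
        if fg.size == 0 or bg.size == 0:
            break
        t_new = 0.5 * (fg.mean() + bg.mean())
        converged = abs(t_new - t) < eps
        t = t_new
        if converged:
            break
    return t
```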


Region-Based Segmentation (2)

e.g.: Region growing from seed cells
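Region growing can be sketched as a breadth-first flood from a seed pixel. The homogeneity test used here (absolute difference to the seed value within a tolerance `tol`) and the 4-neighborhood are assumptions chosen for brevity; other criteria, such as comparing with the running region mean, are equally plausible.

```python
from collections import deque
import numpy as np

def region_grow(image, seed, tol):
    """Grow a region from seed=(row, col), adding 4-neighbors similar to the seed."""
    image = np.asarray(image, float)
    h, w = image.shape
    seed_value = image[seed]
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and not region[ny, nx]
                    and abs(image[ny, nx] - seed_value) <= tol):
                region[ny, nx] = True
                queue.append((ny, nx))
    return region
```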


Region-Based Segmentation (3)

e.g.: Split and Merge

[Figure: example image]


Edge-Based Segmentation (1)

General approach:

1) Smoothing

2) Edge detection

3) Filtering

   a) Eliminate short edges

   b) Close small gaps

4) Obtain closed contours

[Figure: image → edge image]


Edge-Based Segmentation (2)

[Figure: real edge profile and real line profile]

Edge: inflection point

Line: local extremum

Locate it:

Zero crossing of the 2nd derivative (edge) / 1st derivative (line)!

Noise → smoothing needed → convolution with a Gaussian!
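The idea can be tried on a 1D intensity profile: smooth with a Gaussian, then locate the edge at the zero crossing of the second derivative. This is a minimal sketch; the kernel radius of 3σ, the finite-difference derivatives, and the function names are assumptions.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Sampled, normalised 1D Gaussian with radius 3*sigma (an assumed cutoff)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x**2 / (2.0 * sigma**2))
    return g / g.sum()

def edge_position(profile, sigma=2.0):
    """Sub-sample edge location: zero crossing of the smoothed 2nd derivative."""
    kernel = gaussian_kernel(sigma)
    pad = len(kernel) // 2
    padded = np.pad(np.asarray(profile, float), pad, mode="edge")
    smooth = np.convolve(padded, kernel, mode="valid")
    d1 = np.diff(smooth)        # d1[i]: slope between samples i and i+1
    d2 = np.diff(smooth, n=2)   # d2[i]: curvature at sample i+1
    crossings = np.where(d2[:-1] * d2[1:] < 0)[0]
    if crossings.size == 0:
        return None
    # Keep the crossing with the steepest slope, i.e. the strongest edge.
    i = crossings[np.argmax(np.abs(d1[crossings + 1]))]
    # Linear interpolation between samples i+1 and i+2.
    return (i + 1) + d2[i] / (d2[i] - d2[i + 1])
```

For an ideal step between samples 49 and 50 the estimate lies halfway between the two samples.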


Edge-Based Segmentation (3)

D. Marr + E. Hildreth: LoG / DoG zero crossings [1978]

“Mexican hat” / “Sombrero” Operator

http://laurent-duval.blogspot.co.at/2014/09/cours-radial-basis-functions.html

$\Delta (G * I) = (\Delta G) * I$
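For reference, the operator on the right-hand side can be written in closed form. The block below assumes an isotropic 2D Gaussian with standard deviation $\sigma$; the symbol $\sigma$ and the normalisation are assumptions, since the slide does not state them.

```latex
G_\sigma(x,y) = \frac{1}{2\pi\sigma^2}\,
                \exp\!\left(-\frac{x^2+y^2}{2\sigma^2}\right)
\qquad
\Delta G_\sigma(x,y) = \frac{x^2+y^2-2\sigma^2}{\sigma^4}\; G_\sigma(x,y)
```

Because differentiation and convolution commute, the image only needs to be filtered once with this precomputed “Mexican hat” kernel instead of being smoothed first and differentiated afterwards.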


Edge-Based Segmentation (4)

LoG / DoG zero crossings always produce closed contours!

[Figure: original images, results with threshold T, Canny edge detector]


Relaxing the formal definition: Superpixels

Compactness and/or size parameters control the segmentation

Homogeneity may be violated:

distinct neighbors: $\mathrm{neighbor}(R_i, R_j): H(R_i \cup R_j) = \mathrm{false}$ … may be true!

Example: [Achanta et al., SLIC superpixels, PAMI 34(11):2274-2282, 2012]
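How compactness and size enter can be seen in the distance measure that SLIC uses to assign pixels to cluster centers. The sketch below follows the form given in the cited paper, D = sqrt(d_c² + (d_s/S)² m²), with grid interval S = sqrt(N/k); the function names and the default m = 10 are illustrative assumptions.

```python
import numpy as np

def grid_interval(n_pixels, k):
    """Approximate spacing S between superpixel centers for k superpixels."""
    return np.sqrt(n_pixels / k)

def slic_distance(lab_p, xy_p, lab_c, xy_c, S, m=10.0):
    """Combined color/spatial distance between a pixel p and a cluster center c.

    lab_*: color vectors (CIELAB in the paper), xy_*: image coordinates.
    A larger m weights spatial proximity more, giving more compact superpixels.
    """
    d_c = np.linalg.norm(np.asarray(lab_p, float) - np.asarray(lab_c, float))
    d_s = np.linalg.norm(np.asarray(xy_p, float) - np.asarray(xy_c, float))
    return np.sqrt(d_c**2 + (d_s / S) ** 2 * m**2)
```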


SLIC Examples [Achanta et al., PAMI 2012]

Spatial segmentations into “superpixels”


SLIC Examples [Achanta et al., PAMI 2012]

Spatio-temporal segmentations into “supervoxels”


Temporal Superpixels (TSPs) [Chang et al., CVPR’13]


High-Level Segmentation – The “Correct” one?

[Figure: segmentations with 3 regions, 24 regions, 41 regions]

http://www.cs.berkeley.edu/projects/vision/grouping/segbench/


State-of-the-Art Segmentation (1)

Graph Cut [Shi+Malik, PAMI 2000]

J. Shi, J. Malik, “Normalized Cuts and Image Segmentation”, PAMI 22(8):888-905, 2000

Fully connected graph (all pixels of the image !)

Weights $w_{pq}$ measure the similarity between pixels p and q


Graph cut (2)

J. Shi, J. Malik, “Normalized Cuts and Image Segmentation”, PAMI 22(8):888-905, 2000

Partition the graph into subgraphs

Goal: High similarity within a subgraph, low similarity between subgraphs
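The goal stated above is captured by the normalized cut criterion of the cited paper, Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V). The sketch below only evaluates this criterion for a given bipartition of a small weight matrix; it does not perform the eigenvector-based minimisation, and the function name is hypothetical.

```python
import numpy as np

def ncut_value(W, in_a):
    """Normalized cut value of the bipartition (A, B) of a weighted graph.

    W: symmetric matrix of similarities w_pq; in_a: boolean mask selecting A.
    """
    W = np.asarray(W, float)
    in_a = np.asarray(in_a, bool)
    cut = W[in_a][:, ~in_a].sum()   # total weight of edges between A and B
    assoc_a = W[in_a].sum()         # total weight from A to all nodes
    assoc_b = W[~in_a].sum()        # total weight from B to all nodes
    return cut / assoc_a + cut / assoc_b
```

A partition that separates two tightly connected groups yields a much smaller value than one that splits them.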


N-Cut Results (1)


N-Cut Results (2)


State-of-the-Art Segmentation (2)

Graph Based [Felzenszwalb+Huttenlocher, IJCV 2004]

P. Felzenszwalb, D. Huttenlocher, “Efficient Graph-Based Image Segmentation”, IJCV 59(2), 2004

Pairwise comparison of regions

[Figure: original image; segmentation using “grid graph”; segmentation using “nearest neighbor graph”]


Graph Based (2) [Felzenszwalb+Huttenlocher, IJCV 2004]

P. Felzenszwalb, D. Huttenlocher, “Efficient Graph-Based Image Segmentation”, IJCV 59(2), 2004

Pairwise comparison of regions

[Figure: original image; segmentation using “grid graph”; segmentation using “nearest neighbor graph”]


State-of-the-Art Segmentation (3)

Level-Set Based Segmentation

James Sethian (1999): Level Set & Fast Marching Methods, Cambridge.

Stan Osher & Ronald Fedkiw (2002): Level Set Methods and Dynamic Implicit Surfaces, Springer.

Stan Osher & Nikos Paragios (2003): Geometric Level Set in Imaging, Vision+Graphics, Springer.

Split the image into 2 regions, such that:

• Similarity between pixels of a region is maximized

• Contour length is minimized

e.g.: [Chan&Vese, 2001]

$$E(\Phi, u_+, u_-) = \int_\Omega \left[ (f - u_+)^2 H(\Phi) + (f - u_-)^2 \bigl(1 - H(\Phi)\bigr) \right] d\Omega + \nu \int_\Omega \lvert \nabla H(\Phi) \rvert \, d\Omega$$


Level-Set Based Segmentation (2)

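$$E(\Phi, u_+, u_-) = \int_\Omega \left[ (f - u_+)^2 H(\Phi) + (f - u_-)^2 \bigl(1 - H(\Phi)\bigr) \right] d\Omega + \nu \int_\Omega \lvert \nabla H(\Phi) \rvert \, d\Omega$$

$u_+$, $u_-$ Average intensities of foreground and background

$\Phi$ Level-set function

$\Omega$ image domain

$f$ image point (pixel)

$H$ Heaviside step function

$\nu$ weight of regularisation term (contour)

contour

Minimize $E$ w.r.t. $\Phi$, $u_+$, $u_-$, […]$_+$, […]$_-$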

https://en.wikipedia.org/wiki/Heaviside_step_function
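A discrete version of this energy is easy to evaluate for a given level-set function. The sketch below is an illustration only: it uses a sharp Heaviside function (Φ > 0), takes u+ and u− as the region means, and approximates the contour-length term by the summed gradient magnitude of H(Φ). Practical implementations usually use a smoothed Heaviside function and evolve Φ by gradient descent, which is not shown here.

```python
import numpy as np

def region_means(f, phi):
    """Mean intensity where phi > 0 (u_plus) and where phi <= 0 (u_minus)."""
    f = np.asarray(f, float)
    inside = np.asarray(phi) > 0
    u_plus = float(f[inside].mean()) if inside.any() else 0.0
    u_minus = float(f[~inside].mean()) if (~inside).any() else 0.0
    return u_plus, u_minus

def chan_vese_energy(f, phi, nu=0.1):
    """Discrete two-region energy: data term plus nu times contour length."""
    f = np.asarray(f, float)
    H = (np.asarray(phi) > 0).astype(float)
    u_plus, u_minus = region_means(f, phi)
    data = ((f - u_plus) ** 2 * H + (f - u_minus) ** 2 * (1.0 - H)).sum()
    gy, gx = np.gradient(H)
    length = np.sqrt(gx**2 + gy**2).sum()
    return data + nu * length
```

A Φ whose positive region coincides with a bright object gives a lower energy than the same region shifted off the object.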


Level-Set Based Segmentation (3)

[Figure: three examples, each with regions Φ>0 and Φ<0]

Many adaptations, various formulations

e.g. [Paragios&Deriche], [Brox&Weickert], [Fussenegger]


Level-Set Based Segmentation (4)

Multi-region Level-Set


Tokenset: Relaxing the formal definition

1. $\bigcup_i R_i = I$

2. $i \neq j: R_i \cap R_j = \emptyset$

3. some homogeneity criterion H holds: $\forall i: H(R_i) = \mathrm{true}$

4. distinct neighbors: $\mathrm{neighbor}(R_i, R_j): H(R_i \cup R_j) = \mathrm{false}$

• Tokens may be post-processed (e.g. opening/closing)

  • # of holes, # of parts → several regions!

• Tokens may be overlapping (at least: their bounding boxes)

• No need to cover the whole image!

[Diagram: image → segmentation → image description → 2D grouping]


Token – Image Object

• Points (x,y)

• Lines ((x1,y1),(x2,y2))

• Polylines/Chains ((x1,y1),(x2,y2), …,(xn,yn))

• Polygons ((x1,y1),(x2,y2), …,(xn,yn),(x1,y1))

  Squares, rectangles, circles, ellipses, … (parametrized closed contours)

• Constellations/Bitmaps

  “Feret” box: bounding box, aligned with the (x,y) axes

  [Figure legend: foreground, background]
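For a constellation (bitmap) token, the axis-aligned Feret box follows directly from the extreme foreground coordinates. A minimal sketch, assuming the token is given as a boolean mask; the function name and the (x_min, y_min, x_max, y_max) return convention are hypothetical.

```python
import numpy as np

def feret_box(mask):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of the foreground.

    Returns None if the mask contains no foreground pixel.
    """
    ys, xs = np.nonzero(np.asarray(mask))
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```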


Constellation Tokens


Tokenset

A file, i.e., a list of tokens → database, various indexing

[Figure: tokenset file layout with token type, lexicon, and data]


Tokenset → 2D Image/Scene Description

[Figure: “houses” [Matsuyama’90], “face” [Brunelli’92], “pedestrians” [Suzuki’90]]


Example: PhD Pinz (1988)

Finding trees in aerial images …

Original image → smoothed (convolution with a Gaussian) → local brightness maxima → circles

circles → trees

Spruce (Fichte) + Pine (Kiefer)


Example: PhD Pinz (1988)

Finding trees in aerial images …

Original image → smoothed (convolution with a Gaussian) → local brightness maxima → circles

circles → trees

Spruce (Fichte) + Pine (Kiefer)

2D image → 2D scene description

[Diagram: image → image processing, segmentation → image description → 2D grouping → 2D scene description]


2D Scene Representation and Description

You can get very far in 2D!

[Diagram: image → segmentation → image description (2D “image objects”: “tokens”, “tokenset”) → 2D grouping → 2D scene description]


Some very fundamental questions:

• What characterizes an object ?

“objectness”

- compact, (self-)similar, distinct (color, texture), …

• Given an object, what characterizes its shape ?

• Maybe easier to answer: What is not shape?

- color

- texture

- size

?


“Objectness”

[Alexe et al., PAMI 2012]

cf. “Region Proposal Network (RPN)” as one component in ConvNets for object detection (e.g., Faster R-CNN, NIPS’15, see https://arxiv.org/abs/1506.01497)


Object Shape

Humans (again!) are very good at characterizing shape!

Proportions!

[Figure: Pablo Picasso, “Rites of Spring”, from D. Marr, VISION, fig. 3-56 (a)]

2D → 3D


Ideas open to further development

• Spatio-temporal shape (2D space + time)

• Motion patterns, trajectory space, …

• Spatio-temporal shape (3D space + time)


Token Features for 2D Grouping

Shape

• Minimum bounding rectangle (MBR)

• Best ellipse fit

• Aspect ratio (AR): |log(height/width)| = |log(width/height)|

• BR fill: % of foreground pixels in Feret box or in MBR

• Circumference

• Compactness = $\frac{\text{area}}{\text{circumference}^2}$

• Elongatedness = $1 - \frac{\text{minor axis}}{\text{major axis}}$ of the best ellipse fit

Appearance

• Color

• Texture

• Histograms of … local descriptors
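Several of the shape features listed above can be computed directly from a binary token mask. The sketch below makes a few assumptions that the slide leaves open: the bounding rectangle is the axis-aligned Feret box, the circumference is counted as the number of pixel edges between foreground and background, and the “best ellipse fit” is approximated by the second-order moments of the pixel coordinates (axis ratio = square root of the eigenvalue ratio). The mask should contain at least two non-collinear foreground pixels.

```python
import numpy as np

def token_features(mask):
    """Shape features of a binary token (Feret box based, moment-based ellipse)."""
    mask = np.asarray(mask, bool)
    ys, xs = np.nonzero(mask)
    height = int(ys.max() - ys.min() + 1)
    width = int(xs.max() - xs.min() + 1)
    area = int(mask.sum())

    # Circumference: count foreground/background transitions along rows and columns.
    padded = np.pad(mask, 1)
    circumference = int((padded[1:, :] != padded[:-1, :]).sum()
                        + (padded[:, 1:] != padded[:, :-1]).sum())

    # Axis ratio of the moment-equivalent ellipse from the coordinate covariance.
    evals = np.linalg.eigvalsh(np.cov(np.vstack([xs, ys])))  # ascending order
    axis_ratio = np.sqrt(max(evals[0], 0.0) / evals[1])

    return {
        "area": area,
        "aspect_ratio": abs(np.log(height / width)),
        "br_fill": area / (height * width),
        "circumference": circumference,
        "compactness": area / circumference**2,
        "elongatedness": 1.0 - axis_ratio,
    }
```

For a filled square the aspect ratio and elongatedness are zero and the fill is 1; a long thin bar gives an elongatedness close to 1.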


Homographies, Collineations, Perspective Transformations

vs.

Token shape, appearance, etc.

I borrow from „Bildgestützte Messverfahren“ (image-based measurement, 2VO, 1LU) …

A hierarchy of transformations – hierarchy of geometries:

• Euclidean

• Similarity transformation

• Affine transformation

• Perspective transformation

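$$\mathbf{x}' = H\,\mathbf{x}, \qquad \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$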
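In practice the mapping x' = Hx is applied to image points by lifting them to homogeneous coordinates and dividing by the third component afterwards. A minimal sketch, with a hypothetical function name, assuming no point is mapped to infinity (third component zero):

```python
import numpy as np

def apply_homography(H, points):
    """Map 2D points (n x 2) through a 3x3 homography H."""
    points = np.asarray(points, float)
    hom = np.hstack([points, np.ones((len(points), 1))])  # (x, y) -> (x, y, 1)
    mapped = hom @ np.asarray(H, float).T                 # x' = H x, row-wise
    return mapped[:, :2] / mapped[:, 2:3]                 # back to (x'/w, y'/w)
```

Scaling H by any non-zero factor leaves the result unchanged, which is why the nine entries of H carry only 8 degrees of freedom.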


A Hierarchy of Transformations / Geometries

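Projective (P), 8 dof:

$$\begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}$$

Affine (A), 6 dof:

$$\begin{pmatrix} a_{11} & a_{12} & t_x \\ a_{21} & a_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$$

Similarity (S), 4 dof:

$$\begin{pmatrix} s\,r_{11} & s\,r_{12} & t_x \\ s\,r_{21} & s\,r_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$$

Euclidean (E), 3 dof:

$$\begin{pmatrix} r_{11} & r_{12} & t_x \\ r_{21} & r_{22} & t_y \\ 0 & 0 & 1 \end{pmatrix}$$

In 2D, a square transforms to:

[Figure: transformed squares for P, A, S, E]

$$\mathbf{x}' = H\,\mathbf{x}, \qquad \begin{pmatrix} x_1' \\ x_2' \\ x_3' \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$$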

Invariance of token features?
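One way to explore this question is to push a unit square through one matrix from each level of the hierarchy and see which properties survive. The sketch below is only an experiment with arbitrarily chosen example matrices (all numeric values are assumptions): area is preserved by the Euclidean transformation, scaled by s² under similarity, and parallelism of opposite sides survives an affine but not a general projective transformation.

```python
import numpy as np

def transform(M, points):
    """Apply a 3x3 transformation to 2D points via homogeneous coordinates."""
    points = np.asarray(points, float)
    hom = np.hstack([points, np.ones((len(points), 1))]) @ np.asarray(M, float).T
    return hom[:, :2] / hom[:, 2:3]

def polygon_area(q):
    """Shoelace formula for a polygon given as an n x 2 array of vertices."""
    x, y = q[:, 0], q[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def cross2(a, b):
    """z-component of the 2D cross product; zero for parallel vectors."""
    return a[0] * b[1] - a[1] * b[0]

def opposite_sides_parallel(q, tol=1e-9):
    """True if both pairs of opposite sides of the quadrilateral q are parallel."""
    return (abs(cross2(q[1] - q[0], q[2] - q[3])) < tol
            and abs(cross2(q[2] - q[1], q[3] - q[0])) < tol)

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
c, s = np.cos(0.3), np.sin(0.3)
E = np.array([[c, -s, 2], [s, c, 1], [0, 0, 1]])              # rotation + translation
S = np.array([[3 * c, -3 * s, 2], [3 * s, 3 * c, 1], [0, 0, 1]])  # scale 3
A = np.array([[2, 0.5, 1], [0.2, 1, 0], [0, 0, 1]])           # general affine
P = np.array([[1, 0, 0], [0, 1, 0], [0.2, 0.1, 1]])           # projective
```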


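Invariance of Token Features

[Table: token features (rows) vs. transformations (columns); cell entries not present]

Columns: Radiometric transformation, E (Euclidean), S (Similarity), A (Affine), P (Projective)

Rows:

• Feret box: AR

• Feret: BR fill

• MBR: AR

• MBR fill

• best ellipse: elongatedness

• circumference

• compactness

• size (# pix.)

• color

• texture

• # holes

• # parts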


Examples 2D Grouping

Large tokens:

Original image → 434 constellation tokens → 9 tokens > 2000 pixels

Parallelism:

Original image → 26002 straight lines → 94 lines with orientation -4


Grouping – Example (KU/09)

edges → circles (Hough) → coins


Perceptual Grouping (Lowe’87)

[Figure: input image; projection of 3D wireframe model; successful matches by perceptual grouping]


The UMass VISIONS System

Interpretations of Massachusetts “road” scenes [Draper et al., IJCV 1989]

[Figure: original image, interpretation result, interpretation key]


The UMass VISIONS System

Interpretations of Massachusetts “road” scenes [Draper et al., IJCV 1989]

[Figure: original image, interpretation result, interpretation key]


The UMass VISIONS System

Interpretations of Massachusetts “house” scenes [Draper et al., IJCV 1989]

[Figure: original image, interpretation result, interpretation key]


Bottom-Up vs. Top-Down Grouping

2D Models

• Chains of “edgels”, “ridgels”

• Hough transform:

  complex patterns → local maxima

  image space → “Hough” space, bins

• Active contour models

• Shape priors

• Active shape models
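The Hough transform item in the list above can be made concrete for straight lines: every edge point votes for all (ρ, θ) bins of lines passing through it, and a line in image space shows up as a local maximum in the accumulator. The sketch below assumes the normal form ρ = x cos θ + y sin θ with θ in [0, π); bin sizes and function names are illustrative choices.

```python
import numpy as np

def hough_lines(points, n_theta=180, rho_res=1.0):
    """Accumulate votes in (rho, theta) space for a list of (x, y) points."""
    pts = np.asarray(points, float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    rho_max = float(np.hypot(*np.abs(pts).max(axis=0)))  # bound on |rho|
    n_rho = int(np.ceil(2 * rho_max / rho_res)) + 1
    acc = np.zeros((n_rho, n_theta), dtype=int)
    cols = np.arange(n_theta)
    for x, y in pts:
        rho = x * cos_t + y * sin_t
        rows = np.round((rho + rho_max) / rho_res).astype(int)
        acc[rows, cols] += 1                             # one vote per theta bin
    return acc, thetas, rho_max

def strongest_line(acc, thetas, rho_max, rho_res=1.0):
    """Return (rho, theta, votes) of the accumulator maximum."""
    r, t = np.unravel_index(np.argmax(acc), acc.shape)
    return r * rho_res - rho_max, thetas[t], int(acc[r, t])
```

For points on the horizontal line y = 5, the maximum appears at θ = 90° with ρ within one bin of 5.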


Bottom-Up vs. Top-Down Grouping

2D Models

Edges → chains

Lines → ridges

Lines → valleys


“Snakes” – Active Contour Models

Kass, Witkin, Terzopoulos, 1st ICCV, London, 1987

Image energies (greylevel, gradient, …), mechanical model (spring)

Active contours → adaptation to subjective contours
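The combination of image energies and a mechanical model is commonly written as an energy functional over the parametrised contour $v(s) = (x(s), y(s))$. The form below is the one usually quoted from Kass, Witkin and Terzopoulos; the weights $\alpha$, $\beta$ and the constraint term $E_{\text{con}}$ are notation not shown on the slide.

```latex
E_{\text{snake}} = \int_0^1 \Bigl[ E_{\text{int}}\bigl(v(s)\bigr)
                 + E_{\text{image}}\bigl(v(s)\bigr)
                 + E_{\text{con}}\bigl(v(s)\bigr) \Bigr]\, ds,
\qquad
E_{\text{int}} = \tfrac{1}{2}\Bigl( \alpha(s)\,\lvert v_s(s)\rvert^2
               + \beta(s)\,\lvert v_{ss}(s)\rvert^2 \Bigr)
```

The first-derivative term makes the contour behave like a stretched spring (membrane), the second-derivative term like a thin plate that resists bending.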


“Snakes” – Active Contour Models

Kass, Witkin, Terzopoulos, 1st ICCV, London, 1987

Tracking of moving contours

Extension to 3D: “Balloons”


“Shape Priors” in Level Set Segmentation [Fussenegger]

… global deformation (scale, rot, translation)


“Active Shape Models” in Level Set Segmentation [Fussenegger]

• Shape changes dependent on viewpoint → train ASM

• Videos: level_set_hide+seek, level_set_teapot


“Active Shape Models” in Level Set Segmentation [Fussenegger]

3 ASMs learnt:

- elephant

- octopus

- african man

[Figure: a: original image; b: segmentation; c, d: varying the order of the 3 ASMs]


Summary IVU_1 – IVU_4

• Vision

  • Neurophysiology

  • Cognitive psychology

  • Computational theory (Marr paradigm, representations, algorithms)

• Linear Filtering, Convolution

• Definition of terms, system model of image understanding

  • Visual recognition: the “holy grail” of computer vision

• Segmentation and grouping → 2D image/scene description

… recap


Definition: Visual Recognition [Perona’09]

“The holy grail of Computer Vision”

Five tasks of “visual recognition”:

– Verification (is a “car” in the image?)

– Detection and localization (what is there? where?)

– Classification (n “beach” images, m “city” images)

– Naming (name and locate all objects in an image)

– Description: objects, actions, relations, etc.

(example “kissing” → “scene understanding”)

Increasing complexity from top to bottom

Image and Video Understanding: mostly 2D (+time) recognition

Image-based Measurement: 3D (+time) reconstruction


My Model of Image Understanding


2D Scene Representation and Description

You can get very far in 2D!

[Diagram: image → segmentation → image description (2D “image objects”: “tokens”, “tokenset”) → 2D grouping → 2D scene description]

What next?


Course Schedule 2016/17

• Vision

• Neurophysiology

• Cognitive psychology

• Computational theory (Marr paradigm, representations, algorithms)

• Linear Filtering, Convolution

• Definition of terms, system model of image understanding

• Visual recognition: the “holy grail” of computer vision

• Segmentation and grouping → 2D image/scene description

• Object categorization

• Terms, goals, issues, …

• Signal processing: Fourier, Gabor

• Scale

• Object models

• CNNs for image and video understanding