Binocular Stereo
Topics
•Principle•basic equation•epipolar line•features and strategies for matching
•Case study•Block matching•Relaxation •DP stereo
Basic principles
Binocular stereo
single image is ambiguous
A
another image taken from a different direction gives the unique 3D point
a’a”
Epipolar line
Epipolar plane
Epipolar line constraints
Corresponding points lie on the Epipolar lines
Epipolar line constratints
Base line
One image pointPossible line of sight
Epipoles
C1
C2
e1e2
• intersections of baseline with image planes• projection of the optical center in another image• the vanishing points of camera motion direction
Examples of epipolar lines
Examples of epipolar lines
Examples of epipolar lines
Rectification
•rectification
Terminology
A physical point
focal length
right image point
z
left image point
base line length
right image planeleft image plane
World coordinate systemleft image centerright image center
Pinhole Camera
Perspective Projection
Z
X
f
u
(X, Y, Z)
Image plane
X
Y
-Z
uv
(u, v)
f : focal length
View point
(Optical center)
Z
Yf
Z
Xfvu ,),(
Basic binocular stereo equation
z=-2df/(x”-x’)x”-x’: disparity2d : base line length
x” x’
-z
fd d
z
d + x
)("
"
dxz
fx
f
x
z
xd
d - x
)('
'
dxz
fx
f
x
z
xd
dz
fdxdx
z
fxx 2)('"
Features for matching
a. brightness
b. edges
c. edge intervals
d. interest points
10 11 1210 11 12
10 11 1210 11 1211 15 16
a. relaxation
b. coarse to fine
c. dynamic programming
local optimam local optimam
Strategies for matching
global optimam
),(),()(),,( 32321211321 xxfxxfxfxxxf
10 10 1010 5 1010 10 10
10 10 1010 5 1010 10 10
10 10 1010 10 1010 10 10
Classification of stereo methods
•Features for matching•brightness value•point•edge•region
•Strategies for matching•brute-force •coarse-to-fine•relaxation•dynamic programming
•Constraints for matching•epipolar lines•disparity limit•continuity•uniqueness
Case study
Block-Matching Stereo
1. method
b c
b
c
2. problema. trade-off of window size and resolutionb. dull peak
b c
lI
Cost Function
2),(),(minmin w
rldd
ydxIyxIFd
)),(()),((
)),(),,((minmin
ydxIVaryxIVar
ydxIyxICovFd
rl
rl
dd
(b) SSD (sum. of squared difference)
(a) SAD (sum. of absolute difference)
w
rldd
ydxIyxIFd ),(),(minmin
(c) Correlation
NearObject
left
d
Background
NearObject
rightBackground
lI rI
Moravec Stereo(`79)
navigation
Moravec “Visual mapping by a robot rover” Proc 6th IJCAI,pp.598-600 (1979)
Moravec’s cart
Slide stereo
Motion stereo
Slider stereo (9 eyes stereo)
9C2 = 36 stereo pairs!!! each stereo has an uncertainty measure uncertainty = 1 / base-line
each stereo has a confidence measure
22
2
ba
ab
long base line
large uncertainty
Coarse to fine
expand
expand
matching
matching
matching
σ
estimated distance
σ:uncertainty measure
area:confidence measure
9C2 = 36 curves
Interest point
1. Features for matchinga. brightness valueb. pointc. edged. region
2. Strategies for matchinga. brute-force (not a strategy ???)b. coarse-to-finec. relaxationd. dynamic programming
3. Constraints for matchinga. epipolar linesb. disparity limitc. continuityd. Uniqueness
Purpose: navigation (Stanford)
Moravec Stereo(`81)
interest point
lINearObject
Recent Progress
left
disparity
Background
How to estimate the disparities? How to estimate the disparities?
disparity
NearObject
rightBackground
rI
),( rl IIF
),( rl IIF
Minimize some cost function along the epipolar line
H. Hirschnuller, "Improvements in Real-Time Correlation-Based Stereo Vision",IEEE Workshop on Stereo and Multi-Baseline Vision, 2001
Fatting Effect on Object Boundary
NearObject
left
Background
NearObject
right
Background
lI rI
Background Correspondence
Foreground Correspondence
No single window fits at the discontinuity ⇒ Fatting effect of the object
lINearObject
Background
Accurate Estimation on Object Boundary
Shiftable Window
left
NearObject
rightBackground
'''0minarg cccd
disparity),2,2(4
),2,2(3
),2,2(2
),2,2(1
),,(0
dhywxSADc
dhywxSADc
dhywxSADc
dhywxSADc
dyxSADc
''
'
c
c Min{c1,c2,c3,c4}
Min{{c1,c2,c3,c4}-c’}
c0
c1 c2
c3 c4
d
c0
c1
c3
are used.
Consistency Checking
Check if two independent disparity estimation coincide– Left Right search⇒– Right Left search⇒
Inconsistent disparities are considered as a false match
Epipolar line
left right
Epipolar line1st search
2nd search
Check if they coincide
Result
SAD with 11x11 window Shiftable window + consistency checking
Disparity mapLeft image
Cooperative stereo: Marr-Poggio Stereo(`76)
Simulating human visual system(random dot stereo gram)
Marr,Poggio “Cooperative computation of stereo disparity” Science 194,283-287
Input : random dot stereo
left image
random dot
shift the catch pat
right image
we can see the height different between the central and peripheral area
Constraints– Epipolar line constraint
– Uniqueness constraint» each point in a image has only one depth value
O.K. No.
– Continuity constraint» each point is almost sure to have a depth value near the values o
f neighbors
O.K. No.
Uniqueness constraint prohibits two or more matching points on one horizontal or vertical lines
continuity constraint attracts more matching on a diagonal line
ABC
D E F
D E F
A
B
C
A
B
C
(E-A)
(E-B)
(E-C)
prohibit
attract
attract
(D-A)
(E-B)
(F-C)Same depth
n n+1
Relaxation
10 10 1010 5 1010 10 10
10 10 1010 5 1010 10 10
10 10 1010 10 1010 10 10
),( jicn
)1,( jicn
),(1 jicn
)1,( jicn
),1( jicn
),1( jicn
Pr
''''1
''''
),(),(),(ji
nExji
nn jicjicjic
),(0 jic ),(1 jic ),(1 jicn
1. Features for matchinga. brightness valueb. pointc. edged. region
2. Strategies for matchinga. brute-force b. coarse-to-finec. relaxationd. dynamic programming
3. Constraints for matchinga. epipolar linesb. disparity limitc. continuityd. uniqueness
Purpose: simulate the human visual system (MIT)
Marr-Poggio Stereo (`76)
Recent progress:Graph-cut
Solve graph partition problem in globally optimal way1. Formulate the problem in energy minimization framework
2. Design a graph such that the sum of cut edges equals to the total energy
3. Find a “Cut” that minimizes the energy
ij
Vij
Ce
jiC
ji
VE,
,min
Cut
Example of Graph-cut
Image segmentation
Graph={N,e}Ni: Graph nodeeij: Edge connecting nodesC: Cut
Image={pixel}Vij: Similarity between neighboring pixelsForeground/Background boundary
Ce
jiC
ji
VE}{
,
,
min
Ni Nj
Vij
Graph partitionGraph partition SegmentationSegmentation
Solution to a Graph-cut Problem Min-Cut/Max-Flow algorithm
1. Given source (s) and sink nodes (t)2. Define capacity on each edge3. Find the maximum flow from s t, satisfying capacit⇒
y constraints, and cut the bottleneck
Yuri Boykov, Vladimir Kolmogorov, "An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision", PAMI, 2004
ijSource Sink
jiji VC ,,
Min-Cut = Max-FlowMin-Cut = Max-Flow
Bottleneck
Flow
Multi-label Problem
Find the labeling f that minimizes the energy
)()()( fEfEfE datasmooth
)( fEdata
)( fEsmooth measures the extent to which f is not piece wise smooth
measures the disagreement between f and the observed data
)(),()(},{
,
Pp
ppNqp
qpqp fDffVfE
pD Measures how well label fp fits pixel p given the observed data
qpV , Smoothness penalty between adjacent (N) pixels
Yuri Boykov and Olga Veksler and Ramin Zabih, “Fast Approximate Energy Minimization via Graph Cuts,” ICCV, 2001
Iterative graph-cut approach 2 types of move algorithm are proposed
– αβ-swap
– α-expansion
Minimize E under cond. is preserved
Minimize E under cond. can be changed to α
Multi-label Solution via Graph-cut
α
βγ
α
βγ
Graph-cut
Graph-cut
},{ rf
}{rf
αβ-swap Algorithm
1. Start with an arbitrary labeling f
2. Success:= 0
3. For each pair of labelsa. Find f’=arg min E(f’) among f’ within one αβ-swap of f
b. If E(f’) < E(f) then f’:=f and success:=1
4. If success=1 goto 2
5. Return f
α
βγ
L},{
αβ-swap Graph Structure
p q r s
α
β
pt
pt
pt
pt
qpe , qpe ,
edge weight for
PqNq
qp
p
fVD,
),()(
PqNq
qp
p
fVD,
),()(
),( V
Pp
Pqp
Nqp
,
},{
Pp
),( baV should be a semi metric
αβ-swap Cut
3 possible cases
p q
α
β
p q
α
β
p q
α
β
Cut
CutCut
pt
pt
qpe ,
qt
qt
pt
pt
qpe ,
qt
qt
pt
pt
qpe ,
qt
qt
α α β β β α
α-expansion Algorithm
1. Start with an arbitrary labeling f
2. Success:= 0
3. For each labela. Find f’=arg min E(f’) among f’ within one α-expansion of f
b. If E(f’) < E(f) then f’:=f and success:=1
4. If success=1 goto 2
5. Return f
α
βγ
L
α-expansion Graph Structure
p q r s
αpt
pt
ape ,
pt
pt
qpe ,
edge weight for
),( pfV
Pp
qp ff
Nqp
},{
),( baV should be a metric
qt
qt
Auxiliary nodes are added at the boundaryof sets P where
a b
qae ,
at
qp ff
pt )( pp fD Pppt )(pD Pp
ape , ),( pfV
qp ff
Nqp
},{
qae , ),( qfV
),( qp ffV
α-expansion Cut
3 possible cases
p q
α
Cut
pt
qt
a
pt
qt
at
p q
α
Cut
pt
qt
a
pt
qt
at
p q
α
Cut
pt
qt
a
pt
qt
at
),( pfV ),( qfV
),( qq ffV
Because V(a,b) is a metricV(a,b) < V(a,c)+V(c,b)
Never happens!
α α α
Pixel-based Stereo Matching via Graph-cut
left right
)()()( fEfEfE smoothdata
2
2
)(
),(),(
pNqqpsmooth
imageypxryxldata
ffE
pfpIppIE
Image
1 2 L
minimizing the cost function by iterative graph cut (α-expansion)
Label:
Labeling f means disparity assignment
pf
p
Graph-cut with Occlusions
Occluded pixels are handled explicitly in the graph Find a subset of A
Find a configuration f such that
V. Kolmogorov and R. Zabih, “Computing visual correspondence with occlusions via graph cuts,” ICCV, 2001
}0 and |),{( kpqqpqpA xxyy
left right
p q
0
1
a
a
f
fAa if the pixels (p,q) correspond
otherwise
for 0 if Aq)(p,afa p is an occluded pixel
Energy Function to Minimize
)()()()( fEfEfEfE smoothocclusiondata
2)),(),(()( yxryxldata pdpIppIfE
Pp
PPocclusion fNTCfE )0|)((|)(
))2()1(()(}2,1{
2,1
Naa
aasmooth afafTVfE
Occlusion penalty
otherwise
|))()(||,)()((|max if 32,1
sIqIrIpIV rrll
aa
Number of pixels paired with p
T(a) is 1 if a=true, otherwise 0
Minimization Flow
p q r s
w x y z
left
right
disparity 0disparity 1
disparity α (=2)
Current assignment
Possible assignment after α-expansion
α
<p,w>
<p,y>
<q,y>
<q,z>
<r,z>
Choose α
Partition via Graph-cut (α-expansion)
•Construct a Graph•Assign weight to each edge
Result
Left Image Ground Truth
Graph-cut with Occ.Graph-cut without Occ.L1 correlation
DP stereoOhta-Kanade Stereo(`85)
Map making
Ohta,Kanade “Stereo by intra- and inter-scanline search using dynamic programming” ,IEEE Trans.,Vol. PAMI-7,No.2,pp.139-14
now matching become 1D to 1D
yet, N line * ML * MR (512 * 100 * 100 * 10 m sec = 15 hours)
L1L2L3L4L5L6
R1R2R3R4R5R6
L
R
disparity
Path Search
Matching problem can be considered as a path search problem
define a cost at each candidate of path segment based some ad-hoc function
10 100 100
Dynamic programming
We can formalize the path finding problem as the following iterative formula
optimum cost to K
cost between M and K
)();(min)(}{
kDkMdMDk
)1()1;0(),2()2;0(),3()3;0(min)0( DdDdDdD
3 0
2 1
Optimum costs are known
stereo pair
edges
path disparity
depth
stereo pair
edges
depth
1. Features for matchinga. brightness valueb. pointc. edged. region
2. Strategies for matchinga. brute-forceb. coarse-to-finec. relaxationd. dynamic programming
3. Constraints for matchinga. epipolar linesb. disparity limitc. continuityd. uniquenessaerial image analysis (CMU)
Ohta-Kanade Stereo(`85)
Brightness of interval
Recent progress:4-move, 4-plane DP
Occluded pixels are handled explicitly in 4-move, 4-plane representation
Disparity map is calculated under DP Matching (global energy minimization)
A. Criminisi; J. Shotton; A. Blake; C. Rother; P.H.S. Torr, “Efficient Dense-Stereo and Novel-view Synthesis for Gaze Manipulation in One-to-one Teleconferencing,” MSR-TR-2003-59, 2003
Occluded path and visible path cannot bedistinguished in this representation
occluded move
Conventional 3-move DP
Right
Le
ftoccluded m
ove
Matc
hed m
ove
Left scan-line
Right scan-
line
True matching path
Approximated matching path
ProblemProblem
Visible only from right
Visible from both left and right
4-move DP
Occluded path and visible path are handled separately
Occluded move (r)
Matched m
ove(l)
Left scanline
Right scanlin
e
True matching path
Approximated matching path
Occluded m
ove(l)Matched move (r)
Design of Move Transition
Lm
Lo
Ro
Rm
),( rlM
),( rlM ),( rlM
),( rlM),( rlM ),( rlM
),( rlM
),( rlM
22
2
2
1),(
rpr
rpr
lpl
lpl
rpr
rpr
lpl
lpl
IIII
IIIIrlM
Normalized Sum of Squared Difference
Move TransitionMove Transition
Matching CostMatching Cost
Lm Left Matched Move
Right Occluded MoveRo
Rm
Lo Left Occluded Move
Right Matched Move
4-move, 4-plane DP
Each node should hold 4 accumulation costs separately for each move
⇒ 4-plane model
Inter-scanline Consistency
Propagate information across scan-lines ⇒ Gaussian filter is applied on Matching cost array
Without Gaussian filter With Gaussian filter
Left Occluded PixelsRight Occluded Pixels
Result
Left Occluded PixelsRight Occluded Pixels
Input ImagesInput Images 3-move DP3-move DP 4-move, 4-plane DP4-move, 4-plane DP
Comparison in Severe Situation
Graph-cut with occlusions 4-move, 4-plane DPBlock Matching
left right
InputInput
ResultResult Occluded PixelsTexture-less region
Summary1. Two images from two different positions give depth information
2. Epipolar line and plane
3. Basic equation Z=-2df/(x”-x’) x”-x’: disparity 2d : base line length
4. case study Block-matching-based stereo Cooperative stereo DP based stereo
Recent directions1. Block Matching (Local optimization)
– Fast for real-time applications (parallel processing, SIMD)– Not accurate in texture-less region
2. Graph-cut variants (Global optimization) globally consistent disparity map is obtained texture-less region is interpolated nicely
3. 4. 4-move, 4-plane, DP Matching (Global optimization) globally consistent disparity map is obtained texture-less region is interpolated nicely Inter scan-line inconsistency is reduced, but yet to be seen
4. Belief Propagation