

Creating VR Scenes Using Fully Automatic Derivation of Motion Vectors

Kensuke Habuka and Yoshihisa Shinagawa
University of Illinois at Urbana-Champaign

1 Introduction

We propose a new method to create smooth VR scenes using a limited number of images and the motion vectors among them. We discuss two specific components that simulate a majority of VR scenes: MV VR Object and MV VR Panorama. They provide functions similar to QuickTime VR Object and QuickTime VR Panorama [1]. However, our method can interpolate between the existing images, and therefore smooth movement of viewpoints is achieved. When we look at a primitive from arbitrary viewpoints, the images of an object associated with the primitive are transformed according to the motion vectors and to the location of the viewpoint.

2 Motion Vectors and Generating Intermediate Images

First, let us consider two images. The first is called the source image and the second the destination image. Our matching technique solves the correspondence problem between pixels $(x, y)$ in the source image and pixels $(x_{dst}, y_{dst})$ in the destination image. We call $v_f(x, y) = (x_{dst}, y_{dst}) - (x, y)$ the forward motion vectors, that is, the motion vectors from the source image to the destination image. Our method also solves for the backward motion vectors $v_b(x, y)$ from the destination image to the source image. The solution for $v_b$ is not trivial because of the discretization of the domain and occlusions in the two images.
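
The paper does not specify a data layout; the following sketch merely fixes one concrete convention that the later sketches in this section assume: each motion-vector field is stored as an H x W x 2 array of per-pixel displacements.

```python
import numpy as np

# Assumed layout for the sketches below (a convention, not from the paper):
# v_f[y, x] = (x_dst, y_dst) - (x, y), the forward displacement of pixel (x, y);
# v_b is the analogous backward field defined on the destination image.
H, W = 480, 640                                  # example image size
v_f = np.zeros((H, W, 2), dtype=np.float32)      # forward motion vectors
v_b = np.zeros((H, W, 2), dtype=np.float32)      # backward motion vectors

# Example: the source pixel (x, y) = (10, 20) matches destination pixel (13, 22).
x, y = 10, 20
v_f[y, x] = (13 - x, 22 - y)
```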

In this paper, we use the correspondences computed by critical-point filters [2] as the motion vectors. The multiresolutional critical-point filters are nonlinear filters, and they do not destroy the essential structures of the critical points in the images. There are four types of critical-point filters, $\alpha_x\alpha_y p^{(m)}_{(x,y)}$, $\alpha_x\beta_y p^{(m)}_{(x,y)}$, $\beta_x\alpha_y p^{(m)}_{(x,y)}$, and $\beta_x\beta_y p^{(m)}_{(x,y)}$, which preserve stronger critical points such as the peaks, saddle points, and pits. The size of the resulting images is 1/4 of the original. The abbreviations above mean:

\[
\alpha_x I^{(m-1/2)}_{(x,y)} = \max_{0 \le w \le 2} I^{(m)}_{(2x+w,\,y)}, \qquad
\alpha_y I^{(m-1/2)}_{(x,y)} = \max_{0 \le w \le 2} I^{(m)}_{(x,\,2y+w)},
\]
\[
\beta_x I^{(m-1/2)}_{(x,y)} = \min_{0 \le w \le 2} I^{(m)}_{(2x+w,\,y)}, \qquad
\beta_y I^{(m-1/2)}_{(x,y)} = \min_{0 \le w \le 2} I^{(m)}_{(x,\,2y+w)},
\]

where $I^{(m)}_{(x,y)}$ is the intensity of the pixel at $(x, y)$ at the $m$-th resolution. Then, for each pixel at $(x, y)$ of the source image at each resolution level, the corresponding pixel at $(x_{dst}, y_{dst})$ of the destination image is determined so that the associated energy is minimum. The candidate is searched for inside the quadrilateral determined at the upper resolution level.
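
The sketch below illustrates one pyramid level of the four filters. It is a simplification that takes each output pixel's max/min over a 2x2 block of the input; the exact window bounds in the definition above may differ, but the quarter-size output and the max/min structure are the same.

```python
import numpy as np

def cp_filter_level(I):
    """One pyramid level of the four critical-point filters (sketch).

    Simplified sketch: each output pixel takes its max/min over a 2x2 block
    of the input, so each result is 1/4 the size of the original image.
    alpha = max (preserves peaks), beta = min (preserves pits).
    """
    H, W = I.shape[0] - I.shape[0] % 2, I.shape[1] - I.shape[1] % 2
    blocks = I[:H, :W].astype(np.float32).reshape(H // 2, 2, W // 2, 2)
    a_x_a_y = blocks.max(axis=(1, 3))            # alpha_x alpha_y: peaks
    b_x_b_y = blocks.min(axis=(1, 3))            # beta_x beta_y: pits
    a_x_b_y = blocks.min(axis=1).max(axis=2)     # alpha_x beta_y: saddle-like
    b_x_a_y = blocks.max(axis=1).min(axis=2)     # beta_x alpha_y: saddle-like
    return a_x_a_y, a_x_b_y, b_x_a_y, b_x_b_y
```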

Once we obtain the pair of motion vectors, we can generate the intermediate image between the source and destination images by interpolation. First, we transform both images using the motion vectors. Transforming an image $I$ with motion vectors $v$ means that each square $(x, y)$, $(x+1, y)$, $(x+1, y+1)$, $(x, y+1)$ is transformed to $(x, y) + v(x, y)$, $(x+1, y) + v(x+1, y)$, $(x+1, y+1) + v(x+1, y+1)$, $(x, y+1) + v(x, y+1)$. The color of each pixel in the transformed image is bilinearly interpolated. We denote this by $transform(I, v)$.
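
A rough stand-in for $transform(I, v)$ is sketched below. Instead of filling each displaced quadrilateral with bilinearly interpolated colors as described above, it only splats each pixel to its displaced location, so it approximates the forward warp's data flow rather than reproducing it exactly.

```python
import numpy as np

def transform(I, v):
    """Rough stand-in for transform(I, v).

    The method maps each unit square of the pixel grid to the quadrilateral
    spanned by the displaced corners and bilinearly interpolates colors
    inside it. This sketch merely splats every pixel to its displaced
    position (rounded to the nearest pixel), which leaves small holes but
    shows the overall structure of a forward warp.
    """
    H, W = I.shape[:2]
    out = np.zeros_like(I)
    ys, xs = np.mgrid[0:H, 0:W]
    xd = np.clip(np.rint(xs + v[..., 0]), 0, W - 1).astype(int)
    yd = np.clip(np.rint(ys + v[..., 1]), 0, H - 1).astype(int)
    out[yd, xd] = I[ys, xs]
    return out
```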

After both images are transformed, they are blended together using the following expression:

\[
f_{blend2}(1 - \tau)\, transform(I_{src}, \tau v_f) + f_{blend2}(\tau)\, transform(I_{dst}, (1 - \tau) v_b),
\]

where $f_{blend2}$ is a blending function and $0 \le \tau \le 1$ is a blending ratio; $\tau$ is the relative position between the images. The blending function satisfies $f_{blend2}(a) + f_{blend2}(1 - a) = 1$ for $0 \le a \le 1$. We used $f_{blend2}(\tau) = \frac{1}{2}\sin\!\left(\pi\tau - \frac{\pi}{2}\right) + \frac{1}{2}$.

We can generate the intermediate image at an arbitrary point inside the square formed by four viewpoints and four pairs of motion vectors as well. We call this 2D interpolation.
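
The blending step can be sketched as follows, using the blending function given above and an externally supplied warp routine (for example, the transform sketch earlier); the routine is passed in so the blend does not depend on how the warp is implemented.

```python
import numpy as np

def f_blend2(a):
    """Blending function satisfying f(a) + f(1 - a) = 1 on [0, 1]."""
    return 0.5 * np.sin(np.pi * a - np.pi / 2) + 0.5

def intermediate_image(I_src, I_dst, v_f, v_b, tau, transform):
    """Blend the two warped images at relative position tau in [0, 1]."""
    warped_src = transform(I_src.astype(np.float32), tau * v_f)
    warped_dst = transform(I_dst.astype(np.float32), (1.0 - tau) * v_b)
    return f_blend2(1.0 - tau) * warped_src + f_blend2(tau) * warped_dst
```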

3 Creating VR Scenes

3.1 MV VR Object


This section describes how to display arbitrary views of an object using a limited number of images. Let us index the image of an object by the camera's angles $\theta$, $\phi$ and distance $D$ as $I_{\theta,\phi,D}$ in the following.

Suppose there are $N$ images looking at the object from viewpoints on the plane containing the object's center: $I_{0,0,D_0}, I_{\Delta\theta,0,D_0}, I_{2\Delta\theta,0,D_0}, \ldots, I_{(N-1)\Delta\theta,0,D_0}$, where $D_0$ is a constant distance and $\Delta\theta = 360^\circ/N$ is a constant angle. $I_{N\Delta\theta,0,D_0}$ is equal to $I_{0,0,D_0}$. To generate $I_{\theta,0,D_0}$ from the images, we use the method described in Section 2.

We select an integer $i$ that satisfies $i\Delta\theta \le \theta \le (i+1)\Delta\theta$ and $0 \le i < N$ in order to select two images from our existing images. Then, we generate the intermediate image $I_{\theta,0,D_0}$ using $I_{i\Delta\theta,0,D_0}$ as the source image and $I_{(i+1)\Delta\theta,0,D_0}$ as the destination image with the two sets of motion vectors between them at the relative position $\tau = (\theta - i\Delta\theta)/\Delta\theta$.
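
A minimal sketch of this selection step, assuming the captured images are addressed by an integer index $i$ with wrap-around at $N$:

```python
def select_pair(theta_deg, N):
    """Pick the two captured views that bracket the requested angle theta.

    Assumes N images taken every delta_theta = 360/N degrees around the
    object, with image N wrapping around to image 0.
    """
    delta_theta = 360.0 / N
    theta_deg = theta_deg % 360.0
    i = min(int(theta_deg // delta_theta), N - 1)        # source image index
    tau = (theta_deg - i * delta_theta) / delta_theta    # relative position in [0, 1]
    return i, (i + 1) % N, tau                           # src index, dst index, tau
```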

We synthesize $I_{\theta,0,D}$, the image at an arbitrary distance $D$, as follows. To obtain a ray passing through the viewpoint $P$ of $I_{\theta,0,D}$ and each image point $Q$ of $I_{\theta,0,D}$, we need to identify an interpolated image $I_{\theta+\theta_{ft},0,D_0}$ containing the ray $PQ$ by calculating the angle $\theta_{ft}$. For real-time rendering using meshes, we first need to obtain the offset $(x_{ft}, y_{ft})$ from the point $(x, y)$ on the grid of motion vectors. We cannot obtain the offset from the image $I_{\theta+\theta_{ft},0,D_0}$ because $\theta_{ft}$ is unknown. Therefore, we use the offset $(x'_{ft}, y'_{ft}) = \tau v_f(x, y) + (x, y) - (x_{center}, y_{center})$ in the image $I_{\theta,0,D_0}$ as an approximation, where $(x_{center}, y_{center})$ is the center of the image. Then, we approximate $\theta_{ft}(x, y)$ by

\[
\theta'_{ft} = \arctan\!\left(\frac{(D - D_0)\, x'_{ft}}{D_0 D}\right).
\]

The image $I'_{\theta,0,D}$ can be obtained using $v'_{ft} = \tau_{ft} v_f(x, y)$. However, $I'_{\theta,0,D}$ is distorted, with the distortion ratio

\[
\alpha_{ft} = \frac{D \cos(\theta_{ft})}{D + x_{ft} \sin(\theta_{ft})}.
\]

The distortion ratio $\beta_{ft}$ for the $y$ axis is 1 because we have no images or motion vectors along the depression angle $\phi$. Finally, we can compute an approximate forward transform vector:

\[
v_{ft}(x, y) =
\begin{pmatrix}
\alpha_{ft} & 0 \\
0 & \beta_{ft}
\end{pmatrix}
\tau_{ft}\, v_f(x, y).
\]

The expressions to solve for an approximate backward transform vector $v_{bt}$ are almost the same, except for changing $v_f$ to $v_b$ and $\tau$ to $1 - \tau$. After $v_{ft}(x, y)$ and $v_{bt}(x, y)$ are obtained, we can generate an approximate image for $I_{\theta,0,D}$ using the following expression:

\[
f_{blend2}(1 - \tau)\, transform(I_{i\Delta\theta,0,D_0}, v_{ft}) + f_{blend2}(\tau)\, transform(I_{(i+1)\Delta\theta,0,D_0}, v_{bt}).
\]
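
A sketch of the forward-vector correction follows. Note the caveats: $\tau_{ft}$ is not defined in this excerpt and is approximated here by the plain $\tau$, and $x_{ft}$ is approximated by the offset $x'_{ft}$; both substitutions are assumptions of this sketch, not part of the paper's exact derivation.

```python
import numpy as np

def approx_forward_vectors(v_f, tau, D, D0):
    """Sketch of the distance-D correction applied to the forward vectors.

    Assumptions of this sketch: tau stands in for the paper's tau_ft, and
    the approximate offset x'_ft stands in for x_ft.
    """
    H, W = v_f.shape[:2]
    xs = np.arange(W, dtype=np.float32)[None, :]      # pixel x-coordinates
    x_center = (W - 1) / 2.0

    # Approximate horizontal offset x'_ft in the image I_{theta,0,D0}.
    x_ft = tau * v_f[..., 0] + xs - x_center
    # Approximate angle theta'_ft of the ray through each image point.
    theta_ft = np.arctan((D - D0) * x_ft / (D0 * D))
    # Distortion ratios; beta_ft is 1 since there are no images along phi.
    alpha_ft = D * np.cos(theta_ft) / (D + x_ft * np.sin(theta_ft))
    beta_ft = 1.0

    v_ft = np.empty_like(v_f, dtype=np.float32)
    v_ft[..., 0] = alpha_ft * tau * v_f[..., 0]
    v_ft[..., 1] = beta_ft * tau * v_f[..., 1]
    return v_ft
```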

After the image $I_{\theta,0,D}$ is obtained, the image is zoomed in or out according to its position. Figure 1 shows the distortion effect caused by changing the distance $D$. For ease of understanding, it is shown without zooming.

Figure 1: The original image from the distance $D_0$ (b) and synthesized images from the distances $10 D_0$ (a) and $D_0/5$ (c), without zooming.

3.2 MV VR Panorama

In this section, we describe the method of seamless panorama scene interpolation. Using this method, users can view the scene in an arbitrary direction from an arbitrary position.

First, we need to take panoramic images at several locations. We used approximately twelve pictures to make a single panoramic image for each viewpoint. There are many stitching methods to make panoramic images. We denote the panoramic image at the position $(p_x, p_y)$ by $P_{(p_x, p_y)}$.

To generate a dense set of panoramic images from the sparse set, 2D interpolation is used. We assume that we have $N \times M$ panoramic images $P_{0,0}, P_{\Delta p_x,0}, \cdots, P_{(N-1)\Delta p_x,0}, P_{0,\Delta p_y}, \cdots, P_{(N-1)\Delta p_x,\Delta p_y}, \cdots, P_{(N-1)\Delta p_x,(M-1)\Delta p_y}$, where $\Delta p_x$ and $\Delta p_y$ are constant distances and $N$ and $M$ are integers. To generate an intermediate panoramic image $P_{p_x,p_y}$, we select integers $i$ and $j$ that satisfy $i\Delta p_x \le p_x \le (i+1)\Delta p_x$ and $j\Delta p_y \le p_y \le (j+1)\Delta p_y$ in order to determine the four panoramic images to be interpolated. The integers $i$ and $j$ must satisfy $0 \le i < N-1$ and $0 \le j < M-1$. Then we interpolate the image $P_{p_x,p_y}$ using the ratios $\tau_x = (p_x - i\Delta p_x)/\Delta p_x$ and $\tau_y = (p_y - j\Delta p_y)/\Delta p_y$.
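
A minimal sketch of selecting the four surrounding panoramas and the interpolation ratios, assuming the grid origin lies at $P_{0,0}$:

```python
def select_panorama_cell(px, py, dpx, dpy, N, M):
    """Pick the four stored panoramas surrounding the viewpoint (px, py).

    Assumes an N x M grid of panoramas spaced dpx, dpy apart, with the
    grid origin at P_{0,0}.
    """
    i = max(0, min(int(px // dpx), N - 2))    # 0 <= i < N - 1
    j = max(0, min(int(py // dpy), M - 2))    # 0 <= j < M - 1
    tau_x = (px - i * dpx) / dpx              # horizontal blending ratio
    tau_y = (py - j * dpy) / dpy              # vertical blending ratio
    corners = ((i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1))
    return corners, tau_x, tau_y
```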

After panoramic images are synthesized, we need to correct the distortion caused by the camera when making the panoramic images.

References

[1] S. E. Chen and L. Williams. View interpolation for image synthesis. In Proc. SIGGRAPH '93, pages 279-288, 1993.

[2] K. Habuka and Y. Shinagawa. Image interpolation using enhanced multiresolution critical-point filters. International Journal of Computer Vision, 58(1), 2004.
