
INVITED PAPER

Three-Dimensional Imaging for Creating Real-World-Like Environments

The authors of this paper show how 3-D imaging will be able to create real-world-like environments.

By Jung-Young Son, Wook-Ho Son, Sung-Kyu Kim, Kwang-Hoon Lee, and Bahram Javidi, Fellow IEEE

ABSTRACT | Three-dimensional imaging is not yet in the full commercial stage, but its applications are widening due to its capability of creating real-world-like environments. This capability is especially important in realizing reality communication and telepresence operations in medical and unreachable places. For these applications, technologies for interacting with objects in the 3-D image should also be developed. The widespread use of 3-D images is expected in the near future, but before that, the problem of minimizing physical stress, especially eye fatigue, should be solved for 3-D imaging. Currently, the expected solutions to this problem are electroholography and super multiview methods. These two methods work on different principles, but both can provide continuous parallax, as in real-world scenes/objects. Electroholography can also provide focusable depth information to viewers, though its viewing zone angle is limited. For the super multiview, it is not apparent that it can provide focusable depth information, but it can provide a more comfortable viewing condition than holography because of its wider viewing zone angle.

KEYWORDS | Electroholography; eye fatigue; focusable depth information; interactive technology; multiview and integral photography; super multiview image; real-world-like environment; 3-D image

I. INTRODUCTION

Three-dimensional images are much more accurate and realistic than plane images due to the depth sense they can provide. This depth sense can induce a viewer to immerse into the scene in the display panel. The immersive feeling becomes stronger with increasing display size. As the feeling becomes stronger, the scene in the display panel looks more real to the viewers, and the viewers get the feeling of being in the place which the 3-D image describes, i.e., an immersive/presence feeling. This immersive feeling makes it possible to increase efficiency in remote working, such as medical operations and teleoperations, with scenes as real as those which viewers perceive every day in their surroundings. The accuracy and the immersive feeling are the main motives for demanding 3-D images in the areas of communication, broadcasting, medical operations, virtual world presentations, advertisement, training, edutainment (education + entertainment), telemarketing, telepresence, teleconference, visualization of experimental results, etc. The immersive feeling will be undisturbed and maximized if the display panel/screen can display 3-D images that are as natural and real as possible and that do not cause any physical problems, such as stress or strain, to viewers' bodies, especially to their eyes. To generate an undisturbed immersive feeling, the 3-D images on the display panel should provide a natural depth sense, perceived from the vergence and accommodation cues of the viewers' eyes, along with parallaxes.

Manuscript received July 15, 2011; revised October 18, 2011; accepted November 18, 2011. Date of publication February 3, 2012; date of current version December 14, 2012. This work was supported by the Korean Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency (KOCCA) under the Culture Technology (CT) Research and Development Program 2011, and by the IT R&D program of MKE/KEIT (KI001810039169, Development of Core Technologies of Holographic 3D Video System for Acquisition and Reconstruction of 3D Information).

J.-Y. Son is with the Department of Biomedical Engineering, Konyang University, Nonsan, Chungnam 320-711, Korea (e-mail: [email protected]).
W.-H. Son is with the Next Generation Visual Computing Research Team, Electronics and Communication Technology Research Institute, Daejeon 305-700, Korea (e-mail: [email protected]).
S.-K. Kim and K.-H. Lee are with the Imaging Media Research Center, Korea Institute of Science and Technology, Seoul 136-791, Korea (e-mail: [email protected]).
B. Javidi is with the Electrical and Computer Engineering Department, University of Connecticut, Storrs, CT 06269-2157 USA (e-mail: [email protected]).

Digital Object Identifier: 10.1109/JPROC.2011.2178052

In this sense, CAVE [1] and other panoramic view systems [2], [3], which require stereo-glasses for the virtual reality experience, will not be appropriate for creating the real-world-like environment, though they can generate some degree of immersive feeling. Interaction with the 3-D images displayed on the screen/display panel is another way of maximizing the immersive feeling [4], [5]. In this case, images having real object sizes will probably be more effective in creating real-world-like environments. In displaying real-size images, the problem will not be obtaining displays with sizes corresponding to the images, but rather the puppet theater effect, which is induced by the size constancy of the human perception mechanism [6]. A solution to this problem must be found to create the real-world-like environment with the 3-D image in the future. Historically, stereoscopic movies have had great impact on the development of 3-D imaging technologies. The stereoscopic movies based on anaglyphs in the 1930s and linearly polarized glasses in the 1950s stimulated the development of noneyeglasses-type 3-D imaging methods [7]. Recently, the success of the stereoscopic movie Avatar [8] stimulated the commercialization of 3-D TVs based on circularly polarized glasses, high-speed liquid crystal display (LCD) shutter glasses [9]–[11], and the lenticular plate [12]. However, a massive market for the 3-D TV is still not foreseen. This is probably because the quality of a 3-D image is in most cases inferior to that of the plane image. The increasing size and resolution of flat panel displays are rapidly improving the quality of plane images, and scenes from the displays are becoming more realistic. As a consequence, the image quality standards of viewers' eyes have been continuously upgraded. Since the resolutions of flat panel displays are expected to grow to super high definition (SHD) and then to ultrahigh definition (UHD), which has a resolution 16 times higher than the current full HD (1920 × 1080), it is foreseen that the quality standard of viewers' eyes will be upgraded further. This will make the replacement of the plane image by the 3-D image more difficult, unless flat panel displays specialized for 3-D imaging are developed in the near future or a large number of projection units are used [13]. Current 3-D images are mostly built on flat panel displays designed for the plane image. This makes the 3-D image struggle not only with image quality deterioration due to low individual view resolution, but also with moirés, crosstalk, and low brightness [14]. This is especially true for the contact-type multiview imaging methods, such as the multiview (MV) [15] and the integral photography (IP) [16], which need to divide the resolution of the flat panel display into the desired number of different view images to be displayed. In spite of all these odds, there is no doubt that the 3-D image will dominate the plane image in the future, because the plane image itself cannot create a real-world-like environment for a long time period, as demonstrated by IMAX and OMNIMAX [17].

A plane scene can be perceived more realistically as the screen/display panel size and resolution increase. However, this realistic sensation from the large screen is not induced by the viewers' eyes' depth-sensing mechanism; it is induced psychologically. This psychologically induced depth sense often causes physical discomfort, such as dizziness, vomiting, and severe eye fatigue, for many viewers. Hence, plane images are not appropriate for creating a real-world-like environment that competes with the 3-D image, regardless of their size and resolution. This is why many researchers have tried to develop 3-D imaging methods that can provide a friendlier and more natural depth sense than the psychologically induced one. To obtain a friendlier and more natural depth sense, 3-D images should provide a focusable image depth and no exaggerated depth sense for maximizing the visual effect, as in stereoscopic images. A better 3-D image should be an equally magnified/demagnified image of an object or a scene in all three directions, to fit into the screen size of a given display panel. Multiview 3-D images can meet this requirement, but all multiview 3-D imaging methods known so far are still laden with the eye fatigue problem, because they cannot generate images with focusable depths. Holographic and volumetric images, and the optical images formed by imaging optics, have been known to provide 3-D images with real, focusable image depth. But volumetric and optical images are too bulky for their displayable image sizes, so they are impractical for home and office use. This leaves only the holographic image for possibly providing the natural depth sense perceived through the vergence and accommodation cues of the viewers' eyes along with parallaxes. The super multiview imaging methods [18], [19] can also provide a hologram-like image by giving continuous parallax and monocular depth. However, the depth sense provided by them needs to be further investigated in connection with that of the zebra hologram [20], which is considered to be a true example of generating a super multiview image.

In this paper, the basic optical configurations of multiview 3-D imaging methods are defined based on those of the MV and the IP. The configurations are extended to super multiview imaging methods. Furthermore, the characteristics of various super multiview and electroholographic imaging methods are analyzed and reviewed. These methods can provide a continuous parallax and a monocular depth cue, which are believed to create a more natural and real-world-like environment by minimizing eye fatigue.

II. MULTIVIEW IMAGING METHODS

It is considered that the multiview imaging concept was developed in the early 20th century with the inventions of the parallax barrier and the IP [21], though until recently the concept had not been developed further, due to the lack of proper display panels for multiview image display. In this concept, the images seen from different viewing directions are simultaneously displayed spatially to provide both binocular and motion parallaxes, and all have the same resolution. This concept was realized with the rapid development of high-resolution LCDs since the 1990s. As a result, various multiview imaging methods with the parallax barrier, the lenticular plate, and the microlens array as the viewing zone forming optics (VZFO) have been developed [15].

In this section, the basic optical configurations of the multiview imaging methods are defined, and the characteristics of the MV and the IP are analyzed based on the definition.

A. Basic Optical Configurations of Various Multiview Imaging Methods

The multiview imaging methods known so far have very similar optical configurations; they are not too different from each other. In fact, these configurations can be grouped into three types, single, parallel, and radial, depending on the projection direction of each multiview image on the display panel. In the single configuration, only one projector is used to display the multiview images; the images are displayed time-sequentially by dividing the output pupil plane of the projection objective. In this configuration, a high-speed strip shutter array and a high-speed projector are combined [22], [23], and the pupil plane of an objective lens is segmented either by a high-speed shutter array or by a scanner [24]. In the parallel configuration, the images are projected in parallel by a multiple projection optics array; in the radial configuration, they are projected so as to converge to a plane. The word "projection" is used here because images displayed on the display panel are viewed through the VZFO. The representative method of each type is the IP for the parallel configuration and the MV for the radial configuration. These methods provided a theoretical basis for other multiview 3-D imaging methods, such as a point light source array [25], projector arrays [26], a focused light array [27], multiple imaging [28], and the zebra hologram. From the image display point of view, it is possible to consider that both the MV and the IP are basically formed by a 2-D array of projectors to display images with full parallax, i.e., parallaxes in both the horizontal and vertical directions, though the two directions are not independent of each other. The projectors are arranged in the radial configuration for the MV and in the parallel configuration for the IP.

B. The Optical Configurations of the MV and the IP

The basic unit of the IP and the MV is an elemental lens/optics (for the case of the parallax barrier, "optics" would be more appropriate; however, "lens" will be used throughout this paper for convenience) and an image under this lens. Typically, the image is located at the focal plane of the lens in both methods, and it is identified as an elemental image in the IP and as a pixel cell in the MV. The main difference between the IP and the MV lies in the size of this image. Typically, the height and the width of the image are the same as those of the elemental lens for the IP, but slightly larger or smaller for the MV. When making an array of units having the same parameters, the images from different units are joined together on a flat plane without overlapping, in the same way as the elemental lenses in both methods. By this joining, the image in a unit can be imaged by the elemental lenses of its neighboring units in both methods, as indicated by the arrow lines in Fig. 1. The line connecting the centers of the lens and the image in each unit is parallel to those of the other units in the IP.

Fig. 1. Optical configuration of (a) the IP and (b) the MV.


But in the MV, the image position in each unit should be shifted slightly to the left or to the right relative to the elemental lens position. As a result, the line connecting the centers of the image and the lens is no longer normal to them, except for the unit at the center of the array when the number of units in a line of the array is odd. By this shifting, the lines converge to a common point. Hence, the units in the array are aligned in parallel in the IP but radially in the MV.

Being radial, the expanded image from each unit in the MV is completely superposed with those from the other units at the plane parallel to the unit array which contains the common point. Fig. 1 represents the basic optical geometries of the MV and the IP. It also shows that a pixel cell/elemental image can be imaged by the neighboring elemental lenses in front of it. Side viewing zones are formed by this imaging.
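The slight pitch difference behind this radial alignment can be made concrete with a similar-triangles relation. The sketch below is illustrative only: the relation and all numbers are standard multiview design assumptions, not values taken from the paper.

```python
# Hypothetical design relation for the radial (MV) alignment: the unit
# center lines converge at a common point at the viewing distance only
# if the pixel cell pitch slightly exceeds the elemental lens pitch.
def mv_cell_pitch(lens_pitch_mm: float, gap_mm: float, view_dist_mm: float) -> float:
    """Pixel cell pitch that makes all lens-center/image-center lines
    meet at view_dist_mm in front of the lens array (similar triangles)."""
    return lens_pitch_mm * (view_dist_mm + gap_mm) / view_dist_mm

print(mv_cell_pitch(1.0, 3.0, 600.0))  # 1.005 mm for a 1-mm lens pitch
# In the IP the two pitches are equal, so the center lines stay parallel.
```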

C. Differences Between the MV and the IP

Other differences between the MV and the IP lie in the image compositions and the sources of the pixel cells and elemental images. Typically, the sources for the MV and the IP are a multiview camera array and the unit array, respectively [29]. In the IP, each projection unit also works as a camera; hence, the elemental image consists of an image seen through the elemental lens in front of it. For the pixel cell, however, the multiview images are arranged such that different view images are separately viewed at the parallel plane named the viewing zone cross section. Fig. 2 shows the viewing zone forming geometry of the MV [Fig. 2(a)] and the IP [Fig. 2(b)] without considering the side viewing zones. The arrow lines represent the propagation directions of pixels in the pixel cells/elemental images; arrow lines of the same color represent the same-position pixels in the pixel cells/elemental images. Since the image points are continuously expanding, the areas surrounding the crossing points of the arrow lines represent the regions where a mixed image, composed of pixels from different pixel cells/elemental images, keeps its composition; the mixed image is the image projected to the viewers' eyes at that region. In Fig. 2(a), the same-color lines cross each other at the viewing zone cross section. This indicates that different view images are viewed separately in the areas surrounding the crossing points, and that the number of pixels in a pixel cell corresponds to the total number of different view images implemented on the display panel. The number of different view images in the MV is in most cases smaller than the number of pixels in an elemental image, but there cannot be a notable difference between them, because it is also possible to put a view image in each pixel cell. These differences between the MV and the IP appear in the images projected to the viewers' eyes at different locations of a viewing zone. The viewing zone is the spatial volume in front of the VZFO where viewers perceive certain depth senses from images composed of a pixel from each pixel cell/elemental image on the display panel/screen. It is defined as the common field of view of the left- and right-most elemental lenses in the horizontal direction, and of the top and bottom elemental lenses in the vertical direction.

As mentioned before, there are regions where different view images are viewed separately in the MV [30]. The area of the region for each view image is reduced as the number of different view images increases, because the total area of the regions is limited. Outside these regions, different-color lines cross each other, as shown in Fig. 2(a). This means that mixed images of neighboring view images appear in those regions. The number of mixed view images increases as the viewing location moves farther away from the regions. At a farther distance, all the different view images on the display panel are mixed on a pixel basis. When the number of pixel cells in the horizontal/vertical direction of the panel is greater than the number of horizontal/vertical pixels in a pixel cell, this pixel-based mixing of all the different view images is repeated. This shows that viewers in the viewing zone can perceive many more different images than the number of multiview images on the display panel/screen. For the IP case, all elemental images are projected in parallel. This is manifested by the parallelism between the same-color lines shown in Fig. 2(b). The expanded images of the elemental images are separated by the pitch of an elemental lens. Hence, the viewing zone in the IP is defined as the common field of view of all projection units on the display panel/screen.

The distance from the VZFO to the onset position of the common field of view is linearly proportional to the display panel size, because all elemental lenses in the VZFO have the same field-of-view angle. Since each pixel image expands linearly with the distance from the VZFO, the horizontal (vertical) size of each pixel image can exceed that of the display panel. In this case, when the pixels in each elemental image are numbered the same way, the expanded image of a certain numbered pixel from each elemental image will partly overlap with those of the same numbered pixels from the other elemental images on the display panel/screen, as in the MV. The total number of overlapped regions will be the same as the number of pixels in an elemental image. The area of each overlapped region becomes larger as the distance increases and/or the display size becomes smaller. In the overlapped regions, only the same-color pixels are viewed.

The images projected to the viewers' eyes outside the overlapped regions are mixed in the same way as in the MV. This shows that the viewing zone of the IP has almost the same structure as that of the MV, at least from the onset position of the common field of view to the place where the overlapped regions are formed. The distance from the VZFO to the places where the overlapped regions appear increases as the field-of-view angle becomes smaller and/or the display panel size increases. Hence, the overlapping regions may not be obtained with the IP when the size of each elemental image is small and/or the display panel is big.

In the MV and the IP, the smallest size of the dividable areas for differently mixed images and different view images is defined by the diffraction effect caused by each elemental lens in the VZFO. The diffraction effect can be a critical factor in defining the minimum size of the viewing region for each mixed image: due to diffraction, a viewing region with an area smaller than the spot size determined by the effect cannot be discriminated. The diffraction spot size increases as the distance from the VZFO increases and as the pitch of the elemental lens decreases. The spot size at a distance of 1 m for a 1-mm-diameter lens is in the range of 0.86–3.2 mm² for the visible spectral range of 380–730 nm [31]. To cover the whole spectral range, the size of the viewing region for each mixed image should be more than 3.2 mm² at the 1-m distance. As the distance changes, the spot area changes in proportion to the square of the distance (when the distance is given in meters).
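The square-of-distance scaling follows from the linear growth of the diffraction spot diameter with distance. A minimal estimate, assuming the usual Airy-disk first-null criterion (the exact 0.86–3.2 mm² figures of [31] presumably come from a different spot-size criterion, so the values below only approximate them):

```python
import math

def airy_spot_area_mm2(wavelength_m, distance_m, lens_diameter_m):
    """First-null Airy disk behind a circular elemental lens:
    spot diameter = 2.44 * wavelength * distance / lens_diameter.
    The diameter grows linearly with distance, so the spot AREA grows
    with the square of the distance, as stated in the text."""
    d_spot = 2.44 * wavelength_m * distance_m / lens_diameter_m
    return math.pi * (d_spot / 2.0) ** 2 * 1e6  # m^2 -> mm^2

for wl_nm in (380, 730):  # edges of the visible band used in the text
    area = airy_spot_area_mm2(wl_nm * 1e-9, 1.0, 1e-3)
    print(f"{wl_nm} nm: {area:.2f} mm^2 at 1 m, {4 * area:.2f} mm^2 at 2 m")
```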

It is also possible to make both the MV and the IP display images having only horizontal parallax. This is done by displaying a 1-D array of pixel cells for the MV and of elemental images for the IP. In the 1-D array of pixel cells, each pixel cell is formed by a horizontal pixel line. Hence, the vertical lines formed by the pixel cells in a vertical line cell represent the pixel lines of different view images.

Fig. 2. Ray geometries of (a) the MV and (b) the IP.


III. MULTIVIEW IMAGING METHODS STEMMED FROM THE MV AND THE IP

The IP and MV configurations have been modified in several different ways: by removing the VZFO through the use of a 2-D array of point light sources, and by increasing the number of basic units and the number of pixels within a pixel cell/elemental image. In this section, several multiview imaging methods, such as the point light source array, the focused light array, multiple imaging, and the zebra hologram, are introduced and their characteristics are analyzed.

A. Point Light Source Array Method

A 2-D array of point light sources allows building a 3-D image display. However, each light source should emit light with a certain diverging angle, and the light should not be diffused. The angle should cover at least a pixel cell/elemental image [32]. The optical geometry of the point light source array method is not different from those of the IP and the MV, except that the array's position relative to the display panel is reversed. The array can work as the LCD's backlight panel. Hence, the optical structure of the point light source array method is not different from that of the current back-illumination-type light emitting diode (LED) LCDs. The 2-D/3-D conversion can also be done with an extra light source array placed between the sources in the point light source array [33]; the extra light source array should provide diffused light, as the back-illumination-type LED does. The period of the light sources in the point light source array should be slightly larger than the pitch of each pixel cell/elemental image for the MV version, and the same as the pitch for the IP version. The main problem of this method is that it is hard to obtain an ideal point light source array. To display highly resolved images with the point light source array, the area of the emitting surface of each light source should be much smaller than a pixel of the display panel. One way of obtaining this kind of point light source array is using a collimated laser diode (LD) array combined with a microlens array. The size of the focused light beam achieved this way can be less than a few micrometers. This size can be considered an ideal source, but this approach makes the overall system bulky and requires an extra microlens array, which would otherwise be used as a VZFO. However, the array is not visible, because it is behind the LCD. This is desirable for enhancing the 3-D image quality by making the structure of the VZFO invisible. The current surface mount device (SMD) LED array can be used as the point light source array, but its surface area is almost comparable to the pixel size of an LCD. Active flat panel displays such as the organic light emitting diode (OLED) and the field emission display (FED), and even an optical fiber array, can also be used [32].

B. Focused Light Array Method

The point light source can be replaced by a point image array [18], [34]. In this case, no display panel is needed, but a diffuser plate is. The focused light array method is an example of the point image array methods, though it can provide horizontal parallax only (HPO). In this method, pixels from each corresponding position of N different view images are combined into an image point and then scanned onto a diffuser plate. However, as in the point light source array method, the image point array should be formed slightly behind the diffuser plate. The pixels are angularly combined with the angles corresponding to their images' viewing angles. Each image point emits N pixels, each directed to the angle corresponding to that of its view image. This is optically equivalent to a 1-D pixel cell which is focused to diverge. Hence, N different view images are displayed time-sequentially, and each view image is directed to its predetermined direction. Fig. 3 shows the image array in the focused light array method. Each arrow line in Fig. 3 represents the propagation direction of a pixel in an image point. Since each image point consists of 8 pixels, the image points on the diffuser plate together form eight different view images with different angular directions. The image array on the diffuser plate in Fig. 3 will generate the same multiview image as a 1-D IP with elemental images having the same image format as the pixel cells, i.e., a pixel line from each elemental image forms an image together with those from the other elemental images which point in the same direction as the pixel line. This is manifested by the fact that the same-colored lines point in the same angular direction, as shown in Fig. 2(b). The image point array on the diffuser plate is optically equivalent to the elemental lens array, which can be represented by a nodal point array, though it cannot produce any side viewing zone. This is because the image in the back focal plane of a lens is considered to be expanding from an image point centered at the nodal point of the lens. The absence of the side viewing zone means that there is no pseudoscopic viewing region in the point image methods. Since each pixel composing an image point continuously expands, like the pixels in the MV and the IP, each view image will not be discrete, as shown in Fig. 3. The images projected to the viewers' eyes in the common spatial volume formed by the view-one and view-eight images will be almost the same as those in the viewing zone of the MV.

C. Multiple Imaging Method

The multiple imaging method is not different from the focused light array method, except that all the different view images are spatially aligned in the horizontal direction so as to appear at a common diffuser plate. In this method, the images are collimated and projected to the diffuser plate with the angles corresponding to their view directions. All the projected images are completely superposed on each other at the diffuser plate. Each image's propagation direction is the same as in Fig. 3. The collimation causes the gap between two adjacent arrow lines of the same color to be one pixel width of each view. Also, the spatial volume having the shape of a triangular bar, which is formed by the active area of the diffuser plate and the outer sides of the left-most and right-most images, is the place where all the different view images are viewed simultaneously. However, most parts of this volume cannot work as a viewing zone, because at least two different view images share the same pixel position. This pixel-position sharing can also occur outside the triangular bar, though that volume is small. Since the farthest distance reached by the triangular bar is determined by the width of the diffuser plate's active area and the crossing angle of the left-most and right-most images, the volume can be minimized by increasing the angle for a given active area size of the diffuser plate. The collimating also causes each view image to become completely separated from its neighboring view images once the distance from the diffuser plate exceeds that determined by the crossing angle of two adjacent view images and the width of the diffuser plate's active area. This distance increases with a smaller crossing angle. Hence, the viewing zone of this method is a part of the triangular bar. Outside the triangular bar, not all the different view images are mixed into the images projected to the viewers' eyes; the number of different view images participating in forming the mixed images is reduced as the distance from the diffuser plate increases.
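The onset of complete view separation described above is set by simple geometry. A quick sketch with illustrative numbers (the diffuser width and crossing angle below are assumptions, not values from the paper):

```python
import math

# Collimated view images cross at the diffuser; two adjacent views are
# fully separated once the distance exceeds active_width / tan(delta).
active_width_mm = 300.0            # assumed diffuser active-area width
delta = math.radians(1.0)          # assumed crossing angle of adjacent views

z_separate_m = active_width_mm / math.tan(delta) / 1000.0
print(f"adjacent views fully separate beyond ~{z_separate_m:.1f} m")  # ~17.2 m
# A smaller crossing angle pushes this distance farther out, which is why
# the usable viewing zone is only a part of the triangular bar.
```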

D. Zebra Hologram

In the focused light array method, if each image point is made of a plane image instead of a horizontal line of the image, and all image points are presented simultaneously, the image pattern on the diffuser plate becomes the same as that on the surface of a reconstructed zebra hologram. The zebra hologram is composed of a 2-D point hologram array, as in a holographic memory. Each point hologram is recorded with a view image from a 2-D multiview image set. Hence, the hologram is a 2-D stereo hologram. When the zebra hologram is reconstructed, the point hologram array is transformed into a 2-D image point array. This image point array represents the multiview image set. Hence, the zebra hologram is just a hologram version of the IP, and it works the same way as the focused light array. The zebra hologram provides full-parallax information with many more image points than the focused light array, and with each image point having a resolution of more than UHD. However, it is questionable whether this hologram can provide accommodation and vergence to the viewers' eyes as the typical hologram does. Fig. 4 shows the reconstructed multiview images of the zebra hologram for the case where each view image consists of 5 × 3 pixels; Fig. 4 is just a full-parallax version of Fig. 3. The reconstructed image of the zebra hologram can be viewed with a large viewing angle if each point hologram is recorded with a high numerical aperture (NA) objective. It is also possible to record the zebra hologram such that the point image is projected not in the normal direction but with a certain incidence angle to the photographic plate. This makes it possible to view the hologram from a side.

An increasing number of image points will require a bigger photographic plate. This increase will lead to an increase in the resolution of the images projected to the viewers' eyes, and to the diminishing of the overlapped regions for the same numbered pixels mentioned in Section II-C, due to the reduced size of the regions. The appearance of the viewing zone at a farther distance from the hologram will not cause any resolution problems, because the hologram size increases accordingly.

Fig. 3. Focused image array: HPO.


An increasing number of pixels within each image point will make more crossing lines in the geometry shown in Fig. 2. As a consequence, the area of the place for a mixed image will decrease accordingly, and more places for images with different compositions will be created. It is expected that this area can be reduced to less than the pupil size of the viewers' eyes. In this case, each eye of the viewer will cover at least two adjacent places in both the horizontal and vertical directions, and at least two differently mixed images in each direction will enter each eye of the viewer. The compositions of two adjacent mixed images along a line within the common field of view are of the form 1222333444 and 1122233344, respectively, for the case where ten image points of 4 pixels each are aligned horizontally on the hologram plane. The numbers represent the pixel number order when the pixels in each image point are numbered the same way as in the other image points. The compositions do not reveal much notable difference between them. The question is whether the differences are good enough to make each eye of the viewer perceive them as separate. Notice also that the second numbers 2 and 1, the fifth numbers 3 and 2, and the eighth numbers 4 and 3 are adjacent pixels in the same image points; hence, each of these pairs can be perceived as one pixel by the viewers' eyes. The image compositions of the mixed images along other lines in the common field of view will be very similar to the compositions above.
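The stepping of these compositions follows from the ray geometry of Fig. 2. The toy model below is purely illustrative (the pitch, fan divisor, and eye positions are hypothetical, chosen only to reproduce the two strings quoted above): it quantizes the direction from an eye position to each of ten image points into four pixel indices, and a small lateral shift of the eye yields the adjacent composition.

```python
# Toy ray model: 10 image points (pitch 1) of 4 pixels each; the pixel
# index seen from a given eye position advances by one every 3 image
# points. All parameter values here are hypothetical.
def composition(eye_x, n_points=10, pixels_per_point=4, fan=3.0):
    digits = []
    for i in range(1, n_points + 1):
        k = 1 + int((i - eye_x) // fan)              # quantized view direction
        digits.append(str(min(max(k, 1), pixels_per_point)))
    return "".join(digits)

print(composition(-1.5))  # -> 1222333444
print(composition(-0.5))  # -> 1122233344  (eye shifted by one pitch unit)
```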

The compositions indicate that the images projected simultaneously to each eye of the viewer are very similar to each other. This is known as the super multiview condition, in which the 3-D images provide continuous parallax as in a hologram [18]. The continuous parallax is the main characteristic of the image generated by a hologram [35]; it means that the image changes smoothly as viewers change their viewing directions, as in the real world. The minimum condition for providing continuous parallax is that at least two different view images in the horizontal direction are projected simultaneously to each eye of the viewer. This is the basic condition for being super multiview, and it can ease the conflict between convergence and accommodation [36].

In the typical multiview imaging methods, a viewer gets both binocular and motion parallaxes from a set of images with a perceivable disparity between adjacent images. The only difference between the MV and the super multiview is the number of images entering each eye of the viewer. However, the super multiview claims that it can provide a depth sense even with one eye, as the hologram does [37]. There is experimental evidence that supports this claim [38]. When a set of multiview images of five bars at different depths is sequentially projected by a scanner onto the output pupil plane of a camera objective, all these images are fused into one image, and each bar is focusable by changing the focus distance of the camera. Fig. 5 shows the focusing effect of a camera when multiview images are projected onto its objective's pupil plane. The longest and shortest bars correspond to the closest and farthest bars, respectively, from the camera for an input image. This is probably evidence of getting a monocular depth sense by perceiving many fusible different view images simultaneously.

Fig. 4. Focused image array: full parallax.


When the number of pixels within each image point is further increased in the zebra hologram, the difference between the compositions will be further diminished, and more images will enter each eye of the viewer. When a plane parallel to the hologram plane, within the common field of view, is taken, a horizontal line of this plane is mostly segmented into more segments than the horizontal resolution of an image point. If the length of the horizontal line is 1 m and the resolution is 4000, there will be four different images in every 1 mm. When the pupil diameter of the viewers' eyes is 3 mm, a total of 12 different images will enter the eye. In this case, the super multiview condition would be satisfied even with 670 pixels in the horizontal direction of the image point. This means that the zebra hologram can provide the monocular depth sense according to the above experiment. However, it still needs to be verified whether zebra-hologram-type displays can provide monocular depth sense and continuous parallax, because the composition of the mixed images is different from that of the images from a camera array, even though the disparity between adjacent mixed images is uniform, as in the array, and can be quantified as the disparity from an array with a certain camera distance.
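The figures quoted above reduce to one line of arithmetic; a minimal check, using only the numbers stated in the text:

```python
import math

line_width_mm = 1000.0   # horizontal line in the viewing plane (1 m)
pupil_mm = 3.0           # pupil diameter from the text

def images_into_pupil(horizontal_resolution):
    """Different view images entering one pupil when the views of an
    image point are spread evenly over the 1-m line."""
    return horizontal_resolution / line_width_mm * pupil_mm

print(images_into_pupil(4000))                       # -> 12.0
# Minimum resolution for the super multiview condition (>= 2 per pupil):
print(math.ceil(2 * line_width_mm / pupil_mm))       # -> 667 (~670 in the text)
```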

E. Electronic Versions of Super Multiview Imaging Methods

The zebra hologram is just a hologram providing a super multiview image; it is not a display. To be a display, the images on the panel/screen must be changeable electronically. Building a zebra-hologram-type display is not difficult, because it is only necessary to replace each image point with the image point from an image projector. The image of the projector is collimated by collimating optics, and then the collimated beam is focused as an image point [34]. The axes of the collimating optics of the different projectors are aligned in parallel, as in Fig. 6(a). The same kind of display can also be realized by replacing the point light source array with a point image array, as mentioned in Section III-B. Fig. 6(b) shows the display based on the MV optical geometry [39]. Fig. 6 shows the basic optical geometry for building 3-D displays for a super multiview image. Building this type of display will be very costly, even with picoprojectors. Commercially available picoprojectors are found with extended graphics array (XGA) resolution, i.e., 1024 × 768 [40]; this resolution is enough to satisfy the super multiview condition. High-speed projectors such as the digital micromirror device (DMD) [41] can also be used in building the display. In this case, the number of projectors can be reduced, but high-speed projectors and a demagnifying and replication optics, as in the QinetiQ holographic system, are needed [42].

The display structure of the super multiview imaging methods can also be used to display holographic images. In this case, the diffuser plate should be replaced by a screen whose refractive index varies with light intensity, and each image point displays a part of a hologram instead of a view image. Polymer-dispersed liquid crystal (PDLC) [43] is this kind of material for the screen, because its refractive index changes in proportion to the intensity of the incoming light.

IV. ELECTROHOLOGRAPHIC IMAGING METHODS

Holography has been considered superior to any other 3-D imaging method known today because 1) it allows photographing and displaying 3-D images with real depths; 2) it can fit into the plane image format; and 3) a holographic image is as natural as the images which viewers perceive every day in their surroundings. Hence, it can provide accommodation and vergence, along with both binocular and motion parallaxes, to the viewers' eyes; no eye fatigue will be caused.

Fig. 5. Monocular depth cue.


This is why it is considered that the electroholographic image will be one of the mainstream 3-D image technologies that can create a real-world-like environment in the future. The main goal of electroholography, i.e., the electronic display of a holographic image, is developing a display system that can present a full-color moving holographic image of any size, as in the present flat panel displays. For this goal, various electroholographic methods have been developed since the mid-1960s, but achieving this goal still seems far away, due to the lack of proper means of displaying holographic images. The amount of data contained in a holographic image for display is still too high and dense for the displays available in the market. This is the reason why mostly the computer-generated hologram (CGH), the stereohologram, and the near on-axis hologram have been researched so far for electroholography. Some progress has been made in electroholography during the last 50 years, such as: 1) displaying full-color holographic images synthesized from the time-sequential arrangement of computer-generated line holograms, having horizontal parallax only, on the acousto-optic modulator (AOM) [44]; 2) displaying holographic images from charge-coupled device (CCD) cameras on the AOM, using a short pulse laser to freeze the acoustic wave in the AOM [45], [46]; 3) displaying CGH images on display chips such as the liquid crystal on silicon (LCoS), LCDs, and various spatial light modulators (SLMs); 4) developing more efficient CGH calculating algorithms; 5) implementing hardware for the fast calculation of CGHs; and 6) recording holograms on the CCD. Most of these advances were influenced by the holographic video development at the Massachusetts Institute of Technology (MIT) Media Laboratory, which began its electroholography work in 1986 and continued it until 2000. The laboratory displayed full-color holographic images of 15 × 7.5 × 7.5 cm³ in size, reconstructed from a hologram with 256 000 × 144 fringes, and set a new standard for displaying a holographic image on the AOM. The horizontal fringe number of 256 000 is still a record that no other device researched so far has reached.

The development of electroholographic display technology is moving steadily, but its results are not yet ready to be compared with those of current multiview 3-D imaging.

Fig. 6. Electronic version of the zebra hologram: (a) parallel type and (b) radial type.


The quality and the size of the holographic image from electroholography are still far behind those of multiview imaging. However, the development will go on with the strong support of high-resolution and high-density LCDs. There already exists an LCD with UHD resolution, i.e., 7680 × 4320, with a pixel size of 4.8 μm [47]; the corresponding fringe period is finer than the displayable fringe period of the AOM.

A. AOM-Based Electroholographic Imaging Methods

The AOM was originally suggested for displaying the TV signal. To eliminate the image flow due to the signal flow in the AOM, an acousto-optic deflector (AOD) was used to suppress the movement, as shown in Fig. 7 [48]. In Fig. 7, if the acoustic wave speed in the AOM, the distance between the AOM and the AOD, and the scanning time of one image line in the AOM are V_a, d, and T, respectively, the diffraction angle θ_0 of the beam through the AOD should satisfy the relationship θ_0 = V_a T / (2d) for the complete suppression of the movement. Instead of the AOD, MIT used a rotating polygon mirror. It is also possible to suppress the movement by the use of a very short pulse laser with a high pulse repetition rate [45]. In electroholography, a holographic fringe signal is input to the AOM instead of the TV signal. Since there is no means of sampling a real hologram except the CCD, the CGH was used in the holographic video [49].
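To make the compensation condition concrete, here is a worked example with illustrative numbers (the acoustic speed is typical of slow-shear TeO2; the line time and AOM-to-AOD distance are assumptions, not values from the paper):

```python
import math

V_a = 617.0   # acoustic wave speed in the AOM [m/s] (slow-shear TeO2, typical)
T   = 50e-6   # assumed scanning time of one image line [s]
d   = 0.5     # assumed AOM-to-AOD distance [m]

theta_0 = V_a * T / (2 * d)   # compensation condition theta_0 = V_a*T/(2d)
print(f"theta_0 = {theta_0:.4f} rad = {math.degrees(theta_0):.2f} deg")
# The AOD deflection cancels the motion of the travelling acoustic fringe
# pattern during one line scan, so the displayed fringes appear frozen.
```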

The calculated CGH fringe data are transformed into analog signal lines for the AOM. When this signal is input to the AOM as an acoustic wave, it continuously modulates the refractive index of the AOM as it propagates, producing moving holographic fringes that represent the fringe distribution of the line. This signal can be digitized by assigning 1 or 0 to intensities above or below a threshold intensity level, respectively. When the digitized signal is input to the AOM, the AOM becomes a grating with a variable grating period. The quality of an AOM for holographic fringe display is represented by its space-bandwidth product and diffraction efficiency. The product depends on the active aperture size, the acoustic wave propagation speed, and the bandwidth of the AOM. A desirable AOM should have an active aperture size comparable to the actual hologram size, and its bandwidth multiplied by the propagation time of the acoustic wave through the length of the active aperture should be comparable to the number of fringes within the width of the hologram. This means that the speed divided by the bandwidth should equal the fringe period of the hologram. If the beam crossing angle of the reference and object waves is 30°, the number of fringes within a 1-mm width of the hologram and the fringe period are about 818 and 1.2 μm, respectively, for the He-Ne laser (λ = 0.6328 μm).
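These two numbers follow from the standard two-beam interference relation; a quick check, assuming the symmetric-crossing form Λ = λ / (2 sin(θ/2)):

```python
import math

wavelength_um = 0.6328                  # He-Ne laser, from the text
theta = math.radians(30.0)              # reference/object beam crossing angle

period_um = wavelength_um / (2 * math.sin(theta / 2))
print(f"fringe period  : {period_um:.2f} um")      # -> 1.22 um (~1.2 in text)
print(f"fringes per mm : {1000 / period_um:.0f}")  # -> 818, as in the text
```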

As long as the value of the speed divided by the bandwidth stays the same, higher speed AOMs are better than lower speed ones, because more hologram frames and/or more vertical lines per frame can be displayed, and time-wise multiplexing of the AOM width is possible. It is also important to consider the diffraction efficiency, which is represented by the figure of merit of the AOM [50]; its value should be as high as possible. One more parameter that should be considered in the AOM is the spreading of the signal as it propagates. The spreading causes the diffraction efficiency to decrease, because the acoustic power density decreases. The result is a continuous decrease of image brightness from the start of the image to its end. Furthermore, signal mixing between different channels occurs when the AOM operates in a multichannel mode. Hence, the spreading is one important parameter in determining the active aperture size.

Fig. 7. Zenith Radio Corporation version of the AOM-based TV: the polygon mirror in the Scophony TV projection system is replaced by the AOD.


As the AOM for electroholography, tellurium dioxide (TeO2) operating in shear mode is still good, because of its high space-bandwidth product and diffraction efficiency [49]. TeO2 can display fringes with a 12-μm period. This period corresponds to two pixel periods; hence, it corresponds to a 1-D display with a 6-μm pixel period. There are still only a few displays with this pixel period. LCoS is currently the display chip with the smallest pixel size, 4.8 μm. However, the fringe period displayable with this pixel size (two pixel periods, 9.6 μm) corresponds to a beam crossing angle range of 2.4°–4.2° for the visible spectral range (λ = 0.4–0.7 μm). This angle range is still too small for the complete separation of the zeroth- and first-order diffracted beams. The viewing angle is defined as the angle range where a viewer can view the reconstructed image without interference from the zeroth-order beam. Hence, the beam crossing angle is the same as the viewing angle for the collimated reconstructed beam case.
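The quoted angle range follows from the grating equation applied to the minimum displayable fringe period of two pixel pitches; a quick check, assuming sin θ = λ/Λ with Λ = 2 × 4.8 μm:

```python
import math

pixel_um = 4.8                 # LCoS pixel size from the text
fringe_um = 2 * pixel_um       # one fringe needs two pixels -> 9.6 um

for wl_um in (0.4, 0.7):       # edges of the visible band from the text
    theta_deg = math.degrees(math.asin(wl_um / fringe_um))
    print(f"{wl_um} um -> {theta_deg:.1f} deg")   # -> 2.4 and 4.2 deg
```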

So a demagnifying optics is required to separate the zeroth- and first-order beams; this increases the viewing angle by the inverse of the demagnification. However, the demagnifying optics causes several critical problems, such as reducing the hologram size, making the system bulkier, and limiting the displayable hologram size. Hence, it is better not to use demagnifying optics. To eliminate the demagnifying optics, an AOM with a higher space-bandwidth product is needed, so that the displayable fringe periods are much shorter than 12 μm; a period of 1 μm would be desirable for displaying the off-axis hologram. The aperture length of the AOM should also be longer, to display larger holographic images. There is no doubt that the AOM is a good display material for holograms, but a better AOM must be developed in the future; otherwise it cannot compete with LCDs. Since the LCD is developing rapidly these days, its pixel size will be further reduced and its resolution further increased, and the AOM will no longer be able to compete with the LCD unless it is developed substantially to support smaller fringe periods. The MIT Media Laboratory introduced a new holographic video display based on a LiNbO3 AOM [51]; this AOM has more than 1-GHz bandwidth, though its aperture size is small.

B. Computer-Generated Hologram (CGH)

It is still difficult to take a hologram of an outdoor scene, due to the difficulties in setting up the hologram recording. However, the CGH allows obtaining the hologram of a natural scene. The natural scene is captured by a multiview camera array, by the basic unit array of the IP [52], or by a depth camera [53]. These cameras are used to obtain the depth information of the objects in the scene. From the images of the camera array and the basic unit array, the depth information of the objects in the images is calculated; the depth camera provides the depth information of the objects in the scene inherently. Using this depth information, the CGH is calculated. There are many methods of generating the CGH [54]. Most of these CGHs are calculated by the off-axis-type phase calculation and Fresnel zone plate methods [55]. In the process of calculation, many different calculation algorithms have been developed; an algorithm has even been implemented in field-programmable gate array (FPGA) chips to expedite the calculation [56].
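As a concrete illustration of the Fresnel-zone-plate style of CGH calculation, here is a minimal point-cloud sketch. It is illustrative only: the wavelength, pitch, object points, and reference angle are hypothetical, and this is not the specific algorithm of [54]–[56].

```python
import numpy as np

# Point-cloud CGH sketch: each object point contributes a Fresnel zone
# pattern (spherical wave); adding an off-axis plane reference wave and
# taking the intensity gives a real, recordable fringe pattern.
wl = 633e-9                      # wavelength [m] (assumed)
k = 2 * np.pi / wl
pitch = 4.8e-6                   # hologram pixel pitch, as quoted in the text
n = 512                          # hologram sampled on n x n pixels
x = (np.arange(n) - n / 2) * pitch
X, Y = np.meshgrid(x, x)

# hypothetical object points: (x, y, depth from hologram plane, amplitude)
points = [(0.0, 0.0, 0.05, 1.0), (3e-4, -2e-4, 0.07, 0.8)]

field = np.zeros((n, n), dtype=complex)
for px, py, pz, a in points:
    r = np.sqrt((X - px) ** 2 + (Y - py) ** 2 + pz ** 2)
    field += a * (pz / r) * np.exp(1j * k * r)      # spherical wave per point

ref = np.exp(1j * k * X * np.sin(np.radians(1.0)))  # off-axis reference, 1 deg
fringes = np.abs(field + ref) ** 2                  # intensity CGH fringes
print(fringes.shape, float(fringes.min()), float(fringes.max()))
```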

C. Electroholographic Methods Based on the LCoS, the LCD, and the SLM

The LCD was first used to display a hologram in the mid-1990s [57]. Five LCDs, each having 3600 × 960 pixels, were optically combined and aligned in the horizontal direction. The pixel sizes of the LCDs were in the 30-μm range, and the image could barely be watched by eye. Since that time, electroholography has been developed mainly with projection chips such as the SLM, the LCoS, and the LCD [58], because they had pixel sizes and resolutions just sufficient for displaying holograms. The major problems investigated were the use of an LED as the illumination source [59], combining chips for full-color hologram generation [60], multiplexing techniques to effectively enlarge the hologram or the viewing angle [61], and color hologram display with a single display chip [62].

One of the noticeable developments is the QinetiQ system. This system was composed of four horizontal channels of optically addressable SLMs (OASLMs). Each OASLM channel displays a section of the hologram: the hologram is equally divided into four sections in the horizontal direction, and each section is further divided into 5 × 5 segments. An OASLM channel is composed of a high-speed projector, which projects each segment of a section of the hologram in an appropriate time sequence, and replication optics, which demagnifies the fringe pattern and directs it to the appropriate place on the OASLM. The 5 × 5 images from the projector are loaded on the OASLM and together form one section of the hologram; the refractive index of the OASLM varies according to the intensity distribution of the images. By combining the four OASLM channels, the original hologram is loaded, and a hologram with 10 240 × 2610 fringes is formed.
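The tiling arithmetic implied by these numbers can be checked in a few lines; the per-segment fringe counts below are inferred from the stated totals and division scheme, so they should be read as an assumption rather than the system's specification.

    # Tiling implied by the figures quoted above (inferred, an assumption)
    total_fringes = (10240, 2610)  # full hologram (horizontal, vertical)
    sections = 4                   # horizontal sections, one per OASLM channel
    segments = (5, 5)              # segments per section (horizontal, vertical)

    section_w = total_fringes[0] // sections      # 2560 fringes per section
    segment_w = section_w // segments[0]          # 512 fringes per segment (H)
    segment_h = total_fringes[1] // segments[1]   # 522 fringes per segment (V)
    print(f"each projected segment carries ~{segment_w} x {segment_h} fringes")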

Lately, the LCD has again been used to display holograms [63]. The SeeReal holographic display uses an LCD with a size of 20 in. The uniqueness of this display is that it defines the boundary of the image space in which the reconstructed image of the hologram displayed on the LCD can appear, by setting virtual windows for the viewer's eyes, and that it adopts an eye-tracking technique to display the appropriate piece of the hologram corresponding to new eye positions. The maximum extendable depth range is determined by the size ratio of the display and the virtual window. Fig. 8 shows the working principle of the SeeReal display. The display panel is composed of subholograms, and each subhologram generates a voxel of the reconstructed image through the geometrical relationship between the subhologram and the virtual window. The depth of each voxel is determined by the subhologram size when the size of the virtual window is fixed.

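One plausible reading of the geometry in Fig. 8 is a similar-triangles relation: the rays from the virtual window through a voxel bound the subhologram on the panel, so deeper voxels require larger subholograms. The sketch below implements that relation; the window size and viewing distance are illustrative assumptions, not SeeReal specifications.

    def subhologram_width(window_w, view_dist, voxel_z):
        # Similar triangles between the virtual window (width window_w at
        # distance view_dist from the panel) and a voxel at depth voxel_z:
        # voxel_z > 0 floats in front of the panel, voxel_z < 0 lies behind.
        assert voxel_z < view_dist
        return window_w * abs(voxel_z) / (view_dist - voxel_z)

    # Illustrative numbers (assumptions): a 10-mm window viewed at 700 mm
    for z_mm in (-100.0, -50.0, 50.0, 100.0):
        w = subhologram_width(10.0, 700.0, z_mm)
        print(f"voxel at z = {z_mm:+6.1f} mm -> subhologram {w:5.2f} mm wide")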

Recently, a full-color holographic image has been displayed on LCDs of UHD resolution, i.e., 7680 × 4320 pixels, with an LCD pixel size of 4.8 μm.

This pixel size is smaller than that of the TeO2 AOM. The LCD size is 36.8 × 20.7 mm². Three LCDs are used to display the full-color image. The LCD size is somewhat small, but the resolution corresponds to more than eight of the LCDs used for the first LCD-based holographic image display. By combining many such LCDs, a large holographic display can be realized. In the future, the widespread use of this type of LCD, or of even denser, higher resolution LCDs, is expected.

Digital holography allows obtaining the hologram of a real object with a CCD camera [64]. Hence, it can reduce the dependence on the CGH for obtaining a holographic image. However, current digital holography cannot produce a high-quality holographic image due to the small surface area of current CCDs. The obtainable depth is also very small, and the hologram is very noisy because of the optical setup involved in digital holography.
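Numerical reconstruction is the step that turns such a CCD-recorded hologram into an image. A minimal single-FFT Fresnel reconstruction in the spirit of [64] is sketched below; the pixel pitch, wavelength, and reconstruction distance are illustrative assumptions, and a random array stands in for a real recording.

    import numpy as np

    def fresnel_reconstruct(hologram, wavelength, pitch, distance):
        # Discrete Fresnel transform: multiply the hologram by the paraxial
        # chirp and take one FFT; returns the reconstructed intensity.
        ny, nx = hologram.shape
        x = (np.arange(nx) - nx / 2) * pitch
        y = (np.arange(ny) - ny / 2) * pitch
        X, Y = np.meshgrid(x, y)
        chirp = np.exp(1j * np.pi * (X**2 + Y**2) / (wavelength * distance))
        field = np.fft.fftshift(np.fft.fft2(hologram * chirp))
        return np.abs(field) ** 2

    # Illustrative call (assumed values): 532-nm light, 6.8-um CCD pixels,
    # reconstruction at 30 cm; a random array stands in for the hologram.
    holo = np.random.rand(1024, 1024)
    image = fresnel_reconstruct(holo, 532e-9, 6.8e-6, 0.30)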

V. INTERACTIVE TECHNOLOGIES

The real-world-like environment created by a 3-D image provides users with the possibility of interacting with the objects in the image. Since the image is just a replica of the real scene, users can interact, through the scene, either with the real world or with the scene itself. To interact with the real world, the display should detect all the actions of the users and transfer them to an agent (or agents) who will act in the same way as the users. For more accurate interaction, the environmental conditions of, and reactions from, the real world, perceived by the agent while repeating the users' actions, need to be transferred back to the users. For this purpose, many sensors will be needed, including visual, acoustic, tactile, and various mechanical and physical sensors. One of the most difficult problems is transmitting and feeling the texture information of real objects. To interact with the objects in the scene, the same sensors as in the previous case should be used, but to feel the reaction, the information must come entirely from a database.

Interaction with a scene described by a 3-D image has already been used in many areas, such as the nuclear power industry, medical surgery, and the gaming industry. In the nuclear industry, robots with stereo vision work inside dangerous facilities that humans can hardly access; they are directed by operators outside the facilities. During medical surgery, surgeons direct a surgical robot with a joystick through a 3-D monitor. In the gaming industry, the word "4-D," denoting the combination of a 3-D image with physical simulation, has been used widely. The interaction with the 3-D image can be performed with one or all of several simple probes: a 3-D mouse, voice, a gesture, a finger, and a facial expression. However, to secure accuracy in the interaction, a 3-D mouse function should be realized to pinpoint a specific position of objects/scenes within the 3-D scene on the display panel; from the accuracy point of view, the 3-D mouse will be the most effective tool for the interaction (see the ray-picking sketch after this paragraph). For the case of communication, gazing into each other's eyes and exchanging sensoria and information about the atmosphere in which each communication partner is present are other essential functions to be implemented to provide the real feeling of being in the same place, face to face, with the communication partners. For the first purpose, three cameras were used to synthesize the user's face and track the user's eye movement [65]. For the second purpose, a special computer chip was implanted in the user's nervous system to transmit signals which transfer the partner's atmosphere and other sensory feelings.
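As an illustration of the 3-D mouse function discussed above, pinpointing an object reduces to intersecting the pointer's ray with the scene geometry and taking the nearest hit. The sketch below uses hypothetical sphere-shaped targets as stand-ins for scene objects; the names and dimensions are illustrative, not from the paper.

    import numpy as np

    def pick_sphere(origin, direction, center, radius):
        # Smallest nonnegative t with |origin + t*d - center| = radius
        # (standard ray-sphere intersection); None if the ray misses.
        d = direction / np.linalg.norm(direction)
        oc = origin - center
        b = np.dot(oc, d)
        disc = b * b - (np.dot(oc, oc) - radius * radius)
        if disc < 0:
            return None
        t = -b - np.sqrt(disc)
        return t if t >= 0 else None

    # Hypothetical pointer ray from a 3-D mouse and two candidate targets
    origin = np.array([0.0, 0.0, 0.0])
    direction = np.array([0.0, 0.0, 1.0])
    targets = {"voxel A": (np.array([0.0, 0.0, 0.5]), 0.05),
               "voxel B": (np.array([0.3, 0.0, 0.8]), 0.05)}
    hits = {name: pick_sphere(origin, direction, c, r)
            for name, (c, r) in targets.items()}
    nearest = min((t, name) for name, t in hits.items() if t is not None)
    print("picked:", nearest[1])  # nearest intersected object along the ray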

VI. CONCLUSION

Electroholography and super multiview imaging methods will be the main research subjects in 3-D imaging displays in the future, because creating 3-D images as natural as those we perceive every day in our surroundings has been the main goal of the 3-D display since its birth. Any 3-D display that causes physical discomfort to viewers will not take the place of flat panel displays. In addition to avoiding discomfort, the 3-D image quality should be comparable to that of the plane image. Among the currently known 3-D imaging methods, only the holographic image is considered capable of achieving this goal. However, the electronic realization of the holographic image is still in the early stages of development due to the lack of proper means of display. The development of digital holography, high-speed projectors, and high-density, high-resolution displays will spur this realization, but these technologies still need more time to affect it meaningfully. Hence, electroholographic technology cannot be perfected within a short time period.

The super multiview imaging methods are expected to achieve this goal. The experimental results indicate that they can provide continuous parallax and a monocular depth sense, as in a hologram, but the results are still too few to be accepted without doubt, and more investigations should be done in parallel with perfecting the methods. Realization of these methods is much easier than that of electroholography because most of the materials for realizing them are commercially available.

Fig. 8. Working principle of the SeeReal holographic display.

Interacting with the scene described by a 3-D image will become more refined in the future. The interaction will be an essential technology in the realization of reality communication, which is considered the ultimate goal of communication.

REFERENCES

[1] C. Cruz-Neira, D. J. Sandin, and T. A. DeFanti, "Surround-screen projection-based virtual reality: The design and implementation of the CAVE," in Proc. Comput. Graph. (SIGGRAPH), 1993, vol. 27, pp. 135–142.
[2] S. Peleg, M. Ben-Ezra, and Y. Pritch, "Omnistereo: Panoramic stereo imaging," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 3, pp. 279–290, Mar. 2001.
[3] H. Shum and R. Szeliski, "Stereo reconstruction from multi-perspective panoramas," in Proc. 7th Int. Conf. Comput. Vis., 2001, pp. 14–21.
[4] Y. P. Huang, G. Z. Wang, M. C. Ma, S. Y. Tung, S. Y. Huang, H. W. Tseng, C. H. Kuo, and C. H. Li, "Virtual touched 3D interactive system for auto-stereoscopic display with embedded optical sensor," Proc. SPIE - Int. Soc. Opt. Photon., vol. 8043, pp. 80430O-02–80430O-18, 2011.
[5] K. Enami, "Research on ultra-realistic communications," ECTI Trans. Electr. Eng. Electron. Commun., vol. 6, no. 1, pp. 22–25, 2008.
[6] T. Izumi (Supervisor), "Fundamentals of 3-D imaging techniques" (in Japanese), NHK Sci. Technol. Lab., Ohmsa, Tokyo, Japan, 1995, pp. 33–34.
[7] J.-Y. Son, B. Javidi, and K.-D. Kwack, "Methods for displaying three-dimensional images," Proc. IEEE, vol. 94, no. 3, pp. 502–523, Mar. 2006.
[8] Nash Information Services, The Numbers: Movie Avatar (retrieved Oct. 15, 2010). [Online]. Available: http://www.the-numbers.com/movies/2009/AVATR.php
[9] D. Suzuki, T. Fukami, E. Higano, N. Kubota, T. Higano, S. Kawaguchi, Y. Nishimoto, K. Nishiyama, and K. Nakao, "Crosstalk-free 3D display with time-sequential OCB LCD," in Soc. Inf. Display (SID) Symp. Dig. Tech. Papers, 2009, pp. 428–431.
[10] J. Schultz, R. Brott, M. Sykora, W. Bryan, T. Fukami, K. Nakao, and A. Takimoto, "Full resolution autostereoscopic 3D display for mobile applications," in Soc. Inf. Display (SID) Symp. Dig. Tech. Papers, 2009, pp. 127–130.
[11] S. Kim, B. You, H. Choi, B. Berkeley, D. Kim, and N. Kim, "World's first 240 Hz TFT-LCD technology for full-HD LCD-TV and its application to 3D display," in Soc. Inf. Display (SID) Symp. Dig. Tech. Papers, 2009, pp. 424–438.
[12] Entertainment Technology Center, Glasses-Free 3D Display Using Integral Imaging - Toshiba (9-View Lenticular), Sep. 11, 2010. [Online]. Available: http://www.etcenter.org/2010/09/glasses-free-3d-display-using-integral-imaging-toshiba-9-view-lenticular/
[13] J.-Y. Son, B. Javidi, S. Yano, and K.-H. Choi, "Recent developments in 3-D imaging technologies," J. Display Technol., vol. 6, no. 10, pp. 394–403, Oct. 2010.
[14] C. Chen, Y. Huang, S. Chuang, C. Wu, H. Shieh, W. Mphepo, C. Hsieh, and S. Hsu, "Liquid crystal panel for high efficiency barrier type autostereoscopic three-dimensional displays," Appl. Opt., vol. 48, no. 18, pp. 3446–3454, 2009.
[15] J.-Y. Son and B. Javidi, "Three-dimensional imaging systems based on multiview images," J. Display Technol., vol. 1, no. 1, pp. 125–140, Jan. 2005.
[16] F. Okano, H. Hoshino, and I. Yuyama, "Real-time pickup method for a three-dimensional image based on integral photography," Appl. Opt., vol. 36, pp. 1598–1603, 1997.
[17] P. D. Panabaker, G. W. Harris, and W. C. Shaw, "Large format 3D motion picture systems," in Proc. Symp. Three-Dimensional Image Technol. Art, 1992, pp. 49–55.
[18] Y. Kajiki, H. Yoshikawa, and T. Honda, "Ocular accommodation by super multi-view stereogram and 45-view stereoscopic display," in Proc. 11th Int. Display Workshop, 1996, pp. 489–492.
[19] Y. Takaki and H. Nakanuma, "Improvement of multiple imaging system used for natural 3D display which generates high-density directional images," Proc. SPIE - Int. Soc. Opt. Photon., vol. 5243, pp. 43–49, 2003.
[20] M. A. Klug, C. Newswanger, Q. Huang, and M. E. Holzbach, "Active digital hologram displays," Zebra Imaging Inc., U.S. Patent 7 227 674, Jun. 2007.
[21] H. E. Ives, "Optical properties of Lippmann lenticulated sheet," J. Opt. Soc. Amer., vol. 21, pp. 171–176, 1931.
[22] A. R. L. Travis, S. R. Lang, J. R. Moore, and N. A. Dodgson, "Time-multiplexed three-dimensional video display," in Soc. Inf. Display (SID) Symp. Dig. Tech. Papers, 1995, pp. 851–852.
[23] J.-Y. Son, V. G. Komar, Y.-S. Chun, S. Sabo, V. Mayorov, L. Balasny, S. Belyaev, M. Semin, M. Krutik, and H.-W. Jeon, "A multiview 3 dimensional imaging system with full color capabilities," Proc. SPIE - Int. Soc. Opt. Photon., vol. 3295A, pp. 218–225, 1998.
[24] S. A. Shestak, J.-Y. Son, H.-W. Jeon, and V. G. Komar, "Sliding aperture multiview 3-D camera-projector system and its application for 3-D image transmission and IR to visible conversion," Proc. SPIE - Int. Soc. Opt. Photon., vol. 3012, pp. 96–106, 1997.
[25] S.-S. Kim, K.-H. Sohn, V. Savaljev, E. F. Pen, J.-Y. Son, and J.-H. Chun, "A full parallax three-dimensional imaging system based on a point light source array," Jpn. J. Appl. Phys., vol. 40, pp. 4913–4915, 2001.
[26] V. I. Bobrinev, J.-Y. Son, S. A. Shestak, and H.-W. Jeon, "The achromatized transmission type holographic screen for the stereo imaging and the multiview projection," Proc. SPIE - Int. Soc. Opt. Photon., vol. 2951, pp. 168–172, 1996.
[27] Y. Kajiki, H. Yoshikawa, and T. Honda, "Hologram like video images by 45-view stereoscopic display," Proc. SPIE - Int. Soc. Opt. Photon., vol. 3012, pp. 154–166, 1997.
[28] Y. Takaki and H. Nakanuma, "Improvement of multiple imaging system used for natural 3D display which generates high-density directional images," Proc. SPIE - Int. Soc. Opt. Photon., vol. 5243, pp. 43–49, 2003.
[29] J.-Y. Son, S.-H. Kim, D.-S. Kim, and B. Javidi, "Image forming principle of integral photography," J. Display Technol., vol. 4, no. 3, pp. 324–331, Sep. 2008.
[30] J.-Y. Son, V. V. Saveljev, Y.-J. Choi, J.-E. Bahn, and H.-H. Choi, "Parameters for designing autostereoscopic imaging systems based on lenticular, parallax barrier and IP plates," Opt. Eng., vol. 42, no. 11, pp. 3326–3333, 2003.
[31] E. Hecht and A. Zajac, Optics. Reading, MA: Addison-Wesley, 1979.
[32] J.-Y. Son, V. V. Saveljev, D.-S. Kim, Y.-M. Kwon, and S.-H. Kim, "A three dimensional imaging system based on LED array," Opt. Eng., vol. 46, no. 10, p. 103205, 2007.
[33] "A display module in 2-D/3-D compatible multiview 3-D imaging system," Korea Patent 10-0567401-0000.
[34] T. Balogh, "The HoloVizio system," Proc. SPIE - Int. Soc. Opt. Photon., vol. 6055, pp. 60550U-1–60550U-12, 2006.
[35] M. C. Forman, N. Davies, and M. McCormick, "Continuous parallax in discrete pixelated integral three-dimensional displays," J. Opt. Soc. Amer. A, vol. 20, no. 3, pp. 411–420, 2003.
[36] F. Speranza, W. J. Tam, T. Martin, and L. Stelmach, "Perceived smoothness of viewpoint transition in multi-viewpoint stereoscopic displays," Proc. SPIE - Int. Soc. Opt. Photon., vol. 5664, pp. 72–82, 2005.
[37] Y. Takaki and N. Nago, "Multi-projection of lenticular displays to construct a 256-view super multi-view display," Opt. Exp., vol. 18, no. 9, pp. 8824–8835, 2010.
[38] S.-K. Kim, D.-W. Kim, Y.-M. Kwon, and J.-Y. Son, "Evaluation of the monocular depth cue in 3D displays," Opt. Exp., vol. 16, no. 26, pp. 21415–21422, 2008.
[39] "Display device based on point image array," Korea Patent Application No. 10-2009-0010302.
[40] [Online]. Available: http://www.amazon.com/AAXA-M2-Projector-1024x768-Resolution/dp/B004BM2OS4
[41] L. Bogaert, Y. Meuret, S. Roelandt, A. Avci, H. De Smet, and H. Thienpont, "Single projector multiview displays: Directional illumination compared to beam steering," Proc. SPIE - Int. Soc. Opt. Photon., vol. 7524, pp. 75241R-1–75241R-10, 2009.
[42] C. Slinger, C. Cameron, and M. Stanley, "Computer-generated holography as a generic display technology," IEEE Computer, vol. 38, no. 8, pp. 46–53, Aug. 2005.
[43] K. Beev, S. Sainov, T. Angelov, and A. G. Petrov, "Investigation of Bragg gratings recorded in polymer-dispersed liquid crystal," J. Optoelectron. Adv. Mater., vol. 6, pp. 799–803, 2004.
[44] S.-H. Pierre, M. Lucente, J. D. Sutter, R. Pappu, C. D. Sparrell, and S. A. Benton, "Scaling up the MIT holographic video system," Proc. SPIE - Int. Soc. Opt. Photon., vol. 2333, pp. 374–380, 1994.
[45] J. Y. Son, S. Shestak, S. K. Kim, and V. Epikhan, "A multichannel AOM for real time electroholography," Appl. Opt., vol. 38, no. 14, pp. 3101–3104, 1999.
[46] H. H. Choi, S.-K. Kim, J.-Y. Son, and J. W. Wu, "Pulse laser electro-holography using interference fringe pattern captured by a CCD," Appl. Opt., vol. 43, no. 30, pp. 5600–5606, 2004.
[47] T. Senoh, T. Mishina, K. Yamamoto, R. Oi, and T. Kurita, "Viewing-zone-angle-expanded color electronic holography system using ultra-high-definition liquid-crystal displays with undesirable light elimination," J. Display Technol., vol. 7, no. 7, pp. 382–390, Jul. 2011.
[48] L. M. Myers, "The Scophony system: An analysis of its possibilities," TV and Shortwave World, pp. 201–294, Apr. 1936.
[49] S. A. Benton and M. Lucente, "Interactive computation of display holograms," in Proc. Comput. Graph. Int., T. Kunii, Ed., 1992, pp. 129–149.
[50] D. R. Pape, O. B. Gusev, S. V. Kulakov, and V. V. Molotok, "Design of acousto-optic deflectors," in Design and Fabrication of Acousto-Optic Devices, A. P. Goutzoulis and D. R. Pape, Eds. New York: Marcel Dekker, 1994, pp. 77–102.
[51] Q. Y. J. Smithwick, D. E. Smalley, V. M. Bove, Jr., and J. Barabas, "Progress in holographic video displays based on guided-wave acousto-optic devices," Proc. SPIE - Int. Soc. Opt. Photon., vol. 6912, pp. 69120H-1–69120H-10, 2008.
[52] N. T. Shaked, J. Rosen, and A. Stern, "Integral holography: White-light single-shot hologram acquisition," Opt. Exp., vol. 15, pp. 5754–5760, 2007.
[53] T. Senoh, K. Yamamoto, R. Oi, T. Mishina, and M. Okui, "Computer generated electronic holography of natural scene from 2D multi-view images and depth map," in Proc. 2nd Int. Symp. Universal Commun., 2008, pp. 126–133.
[54] W. J. Dallas, "Computer generated holograms," in The Computer in Optical Research. Berlin, Germany: Springer-Verlag, 1980, pp. 291–366.
[55] T. C. Poon, K. B. Doh, B. W. Schilling, M. H. Wu, K. Shinoda, and Y. Suzuki, "Three-dimensional microscopy by optical scanning holography," Opt. Eng., vol. 34, pp. 1338–1344, 1995.
[56] T. Shimobaba, A. Shiraki, N. Masuda, and T. Ito, "Electroholographic display unit for three-dimensional display by use of special-purpose computational chip for holography and reflective LCD panel," Opt. Exp., vol. 13, no. 11, pp. 4196–4201, 2005.
[57] K. Maeno, N. Fukaya, O. Nishikawa, K. Sato, and T. Honda, "Electro-holographic display using 15-megapixel LCD," Proc. SPIE - Int. Soc. Opt. Photon., vol. 2652, pp. 15–23, 1996.
[58] F. Yaras, H. Kang, and L. Onural, "State of the art in holographic displays: A survey," J. Display Technol., vol. 6, no. 10, pp. 443–454, Oct. 2010.
[59] T. Ito, T. Shimobaba, H. Godo, and M. Horiuchi, "Hologram reconstruction with a 10-μm pixel-pitch reflective liquid-crystal display by use of a light-emitting diode reference beam," Opt. Lett., vol. 27, pp. 1406–1408, 2002.
[60] K. Takano and K. Sato, "Color electro-holographic display using a single white light source and a focal adjustment method," Opt. Eng., vol. 41, pp. 2427–2433, 2002.
[61] T. Nagai, Y. Yabe, and Y. Sakamoto, "A system of enlarging visual field and viewing zone simultaneously for electro-holography," Proc. SPIE - Int. Soc. Opt. Photon., vol. 6488, pp. 64880W-1–64880W-8, 2007.
[62] T. Ito and K. Okano, "Color electroholography by three colored reference lights simultaneously incident upon one hologram panel," Opt. Exp., vol. 12, pp. 4320–4325, 2004.
[63] S. Reichelt, H. Sahm, N. Leister, and A. Schwerdtner, "Capabilities of diffractive optical elements for real-time holographic displays," Proc. SPIE - Int. Soc. Opt. Photon., vol. 6912, pp. 69120P-1–69120P-11, 2008.
[64] U. Schnars and W. P. O. Juptner, "Digital recording and numerical reconstruction of holograms," Meas. Sci. Technol., vol. 13, pp. R85–R101, 2002.
[65] R. Buschmann, PANORAMA: Package for New OpeRational Autostereoscopic Multiview Systems and Applications, Report No. AC092/SIE/Final Demo/DS/P/032/b1 of Project No. AC092, Oct. 1998.

ABOUT THE AUTHORS

Jung-Young Son received the B.Eng. degree in avionics from the Korea National Aviation University, Korea, in 1973, and the M.S. degree in electronics and the Ph.D. degree in engineering science from the University of Tennessee, Knoxville, in 1982 and 1985, respectively.

From 1980 to 1985, he was a Graduate Research Assistant, and from 1985 to 1989, a Research Scientist at the Space Institute, University of Tennessee. From 1989 to 2002, he worked at the Korea Institute of Science and Technology as a Principal Research Scientist in optics; from 2002 to 2007, as a Research Professor at Hanyang University, Seoul, Korea; and from 2007 to 2010, as a Chair Professor at the School of Computer and Communication Engineering, Daegu University, Kyungsan, Kyungbuk, Korea. He is currently a Professor at the Biomedical Engineering Department, Konyang University, Nonsan, Chungnam, Korea. His primary interests are 3-D image display, recording, and transmission; electroholography; millimeter-wave, IR, and spectral imaging for medical applications; and laser-based optical instrumentation and measurement. He has published more than 70 SCI journal articles and more than 150 conference proceedings papers, as well as more than 100 Korean articles, 17 books (one authored, six coedited, two with contributed chapters, two coauthored, four translated, and one short course note), and more than 57 registered patents.

Dr. Son is a Fellow of SPIE and the Optical Society of Korea, and is a member of IEEE, Sigma Xi, Phi Kappa Phi, and the Optical Society of America. He is also an Associate Editor of the OSA/IEEE JOURNAL OF DISPLAY TECHNOLOGY.

Wook-Ho Son received the B.S. degree in computer science from Yonsei University, Seoul, Korea, in 1987 and the M.S. and Ph.D. degrees from Texas A&M University, College Station, in 1996 and 2001, respectively.

Currently, he is in charge of the New Generation Image Research Team, ETRI, Daejeon, Korea. His research interests include digital holography, virtual reality, augmented reality, haptic interaction, physically based dynamic simulation, and robotics.

Sung-Kyu Kim received the B.S., M.S., and Ph.D. degrees from the Quantum Optics Group of Physics, Korea University, Seoul, Korea, in 1989, 1991, and 2000, respectively.

He then spent two years as an Invited Research Scientist at the 3-D TV Group of the Telecommunications Advancement Organization in Japan. In 2001, he was appointed Senior Research Scientist at the Korea Institute of Science and Technology. His research interests include optical design of 3-D display systems, super multiview displays, multiview image processing, digital holography, holographic optical elements, and multifocus 3-D displays.


Kwang-Hoon Lee received the B.S. and M.S. degrees from the Department of Physics, Soonchunhyang University, Asan, Korea, in 2000 and 2002, respectively, and the Ph.D. degree from the Department of Advanced Technology Fusion, Konkuk University, Korea, in 2012.

He then spent two years as a Researcher at the Material Device and MEMS Division, Samsung Advanced Institute of Technology, Korea. Since 2005, he has been working in the 3-D field at the Korea Institute of Science and Technology. His research interests include optical design of 3-D display systems, super multiview displays, multiview image processing, digital holography, holographic optical elements, and human factors in depth recognition.

Bahram Javidi (Fellow, IEEE) received the B.S. degree from George Washington University, Washington, DC, and the M.S. and Ph.D. degrees from the Pennsylvania State University, University Park, all in electrical engineering.

He is the Board of Trustees Distinguished Professor at the University of Connecticut. He has over 730 publications, including over 310 peer-reviewed journal articles and over 360 conference proceedings papers, among them over 110 plenary addresses, keynote addresses, and invited conference papers. His papers have been cited over 9000 times according to the citation index of the Web of Science (h-index = 52).

Dr. Javidi is a Fellow of seven national and international professional scientific societies, including IEEE, the American Institute for Medical and Biological Engineering (AIMBE), the Optical Society of America (OSA), and SPIE. In 2010, he received The George Washington University's Distinguished Alumni Scholar Award, the University's highest honor for its alumni in all disciplines. In 2008, he received a Fellow award from the John Simon Guggenheim Foundation. He is a coauthor on nine best journal and conference paper awards. He received the 2008 IEEE Donald G. Fink Prize Paper Award among all (over 130) IEEE Transactions, Journals, and Magazines. In 2007, the Alexander von Humboldt Foundation awarded him the Humboldt Prize for outstanding U.S. scientists. He received the Technology Achievement Award from the International Society for Optical Engineering (SPIE) in 2008. In 2005, he received the Dennis Gabor Award in Diffractive Wave Technologies from SPIE. He was the recipient of the IEEE Lasers and Electro-Optics Society (IEEE Photonics) Distinguished Lecturer Award twice, in 2003–2004 and 2004–2005. He was awarded the IEEE Best Journal Paper Award from the IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY twice, in 2002 and 2005. Early in his career, the National Science Foundation named him a Presidential Young Investigator, and he received The Engineering Foundation and IEEE Faculty Initiation Awards. In 2003, he was selected as one of the nation's top 160 engineers between the ages of 30 and 45 by the National Academy of Engineering (NAE) to be an invited speaker at its Frontiers of Engineering Conference, cosponsored by the Alexander von Humboldt Foundation, and he has been an alumnus of the Frontiers of Engineering of the National Academy of Engineering since 2003. He is on the Editorial Board of the PROCEEDINGS OF THE IEEE, and he was on the board of the IEEE/OSA JOURNAL OF DISPLAY TECHNOLOGY.
