
3D Organ Modeling - Clarkson University (source: sonarav/Files/MThesis.pdf)



Clarkson University

3D Organ Modeling

A thesis by

AJAY V. SONAR

Department of Electrical and Computer Engineering

Submitted in fulfillment of the requirements

for the degree of

MASTER OF SCIENCE

(ELECTRICAL ENGINEERING)

DATE

Accepted by the Graduate School

Date Dean


The undersigned have examined the thesis entitled:

3D Organ Modeling

presented by Ajay Sonar, a candidate for the degree of Master of Science

and hereby certify that it is worthy of acceptance.

__________    ________________________
Date          Advisor

Dr. James J. Carroll
Associate Professor
Electrical and Computer Engineering Department

Examining Committee:

_________________________
Dr. Sunil Kumar
Assistant Professor
Electrical and Computer Engineering Department

_________________________
Dr. Robert J. Schilling
Professor
Electrical and Computer Engineering Department


Abstract

Considerable research has been devoted to 3D modeling in recent years. Image-based reconstruction from multiple views is a challenging problem with applications in various fields, one of which is medicine. 3D models of body organs assist in understanding their mechanics far better than conventional 2D MRI or CT scan images or gross pathologic examination. This ongoing study takes a biomechanical approach to understanding the mechanisms involved in Abdominal Aortic Aneurysm (AAA) pathogenesis, to help improve the ability to identify aneurysms at high risk of rupture and hence aid clinical management. Rupture of an AAA is the 13th leading cause of death in the United States. AAA is a disease that affects the large blood vessel in the abdomen, the abdominal aorta. In some patients this vessel starts to bloat and keeps bloating until it is either surgically repaired (by implanting an artificial tube in its place) or it ruptures. The project ultimately has two goals: 1) understand AAA disease, and 2) develop ways to predict when an AAA will rupture. This is possible by applying numerical techniques (e.g., the finite element method) to the 3D model of the organ to estimate the distribution of stress in the walls of different aneurysms. Such studies can aid greatly in understanding how the disease progresses.


Contents

1. INTRODUCTION

1.1. Overview

1.2. Related Work

2. CAMERA CALIBRATION

2.1. Introduction to Camera Calibration

2.2. Parameters

2.2.1. Intrinsic Parameters

2.2.2. Extrinsic Parameters

2.3. Calibration Steps

2.4. Setup

3. IMAGE ACQUISITION

3.1. Background Subtraction and Silhouette Extraction

4. 3D RECONSTRUCTION

4.1. Introduction to Voxel Carving

4.2. Voxel Carving by Silhouette Extraction

4.3. Voxel Carving by Coloring

4.3.1. Color Invariants

4.3.2. Ordinal Visibility Constraint

4.3.3. Voxel Coloring by Layered Scene Decomposition

4.3.4. Single Pass Algorithm

4.4. Surface Reconstruction

4.4.1. Problems Associated with Marching Cube Algorithm

4.4.1.1. Ambiguous Faces

4.4.1.2. Internal Ambiguities

4.4.2. Resolving Ambiguities

4.4.2.1. Resolving Ambiguities on the Face

4.4.2.2. Resolving Internal Ambiguities


5. RESULTS

5.1. On the Phantom Model

5.1.1. Reconstruction Accuracy

5.2. On the Actual Specimen

6. CONCLUSION AND FUTURE WORK

6.1. Ways to Improve Results

6.1.1. Extended Lookup Table for the Marching Cube Algorithm

6.1.2. Voronoi-based Surface Reconstruction

6.2. Other Applications

REFERENCES

APPENDIX 1

APPENDIX 2

APPENDIX 3


List of figures

Figure 1: Calibration grid of one of the calibration images

Figure 2: The Setup

Figure 3(a): Reference Background

Figure 3(b): Image of the Object

Figure 3(c): Image after Background Subtraction

Figure 3(d): Image after Thresholding

Figure 4: Reconstruction from three views

Figure 5: The Voxel space.

Figure 6: Projecting a voxel on to the camera image plane

Figure 7: Bounding Box

Figure 8: Projection of voxels on the silhouette on one of the camera views

Figure 9: Carved Voxels after silhouette intersection test

Figure 10: Voxel Coloring. Given a set of basis images and a grid of voxels, color values

to voxels have to be assigned in a way that is consistent with all images

Figure 11: Example of Spatial ambiguity. Both voxel colorings appear identical from

these two viewpoints, despite having no colored voxels in common

Figure 12: Example of Color ambiguity. Both voxel colorings appear identical from these

two viewpoints. But the second row, center voxel has different color assignment in the

two scenes

Figure 13: Each of the six voxels has the same color in every consistent scene in which it

is contained. The collection of all such color invariants forms a consistent voxel coloring

denoted by S

Figure 14: Compatible camera configurations. (a) An overhead inward-facing camera

moving 360 degrees around the object. (b) An array of outward facing cameras.

Figure 15: 2D Layered scene traversal. Voxels can be partitioned into a series of layers of

increasing distance from the camera volume

Figure 16: 3D Layered Scene Traversal. Starts with L0 through L3

Figure 17: Result of voxel coloring algorithm alone

Figure 18: Final carved voxels


Figure 19: Triangulate Voxels

Figure 20: Indexing convention of the vertices and edges of a voxel

Figure 21: Vertex 1 is inside the surface and the rest of the vertices are outside the

surface

Figure 22: Final Surface Reconstructed model

Figure 23: Formation of holes in the surface

Figure 24: Two different configurations of triangulation with the same set of intersection points

Figure 25: Extended Lookup Table

Figure 26: Resolving the ambiguity on a face

Figure 27: Two configurations of case 4

Figure 28(a-d): Four Different Views of the Phantom Model

Figure 29(a-c): Three Different Views of the Actual Specimen

Figure 30: Projection of the Carved Voxels on the Silhouettes

Figure 31: 3D point-cloud obtained from the Marching Cube algorithm


1. INTRODUCTION

1.1. Overview

Acquiring a 3D model from a set of input images is a challenging task. New graphics-oriented applications such as tele-presence, virtual walkthroughs, and virtual view synthesis have drawn the attention of the computer vision community to it. Depending on the application, different approaches have been adopted to achieve this: reconstruction from stereo images [2], [3], [4], [5], or from multiple images taken with a single camera [6], [7], [14]. In this project a combination of voxel carving from silhouettes and the voxel coloring method [1], [6] is used to reconstruct a 3D model from multiple images of the phantom model of the stomach affected by Aortic Aneurysm. The foreground, which contains the image of the organ, is separated from the background by calculating an appropriate background model. A voxel model is generated from those images, and a triangulated surface model is generated from the voxel model. Finite Element Analysis is then to be performed on the model to study the mechanisms involved in the AAA disease.


1.2. Related Work

Different methods have been tried to reconstruct 3D graphical models of real objects. Reconstruction by silhouette-based volume intersection is one of these methods [7], [14]. The silhouette-based method constructs an approximate visual hull of the object. Some excess volume is produced in this approximated visual hull due to concavities on the object and insufficient camera viewing angles. Rather than using binary silhouette images, shape from photo consistency employs additional photometric (color) information [1], [6]. The method is similar to the Space-Sweep approach presented in [8] and [9], which performs an analogous scene traversal. In [9] a plane is swept through the scene volume and votes are accumulated for points on the plane that project to edge features in the images. Scene features are identified by modeling the statistical likelihood of accidental accumulation and thresholding the votes to achieve a desired false positive rate. This approach is useful in the case of limited occlusions, but does not provide a general solution to the visibility problem.

A similar edge-based voting technique that uses linear subspace intersections

instead of a plane sweep to obtain feature correspondences is described in [28]. In this

approach, each point or line feature in an image “votes” for the scene subspace that

projects to that feature. Votes are accumulated when two or more subspaces intersect,

indicating the possible presence of a point or line feature in the scene. A restriction of this

technique is that it detects correspondences only for features that appear in all input

images.

In [10] and [11] a dome of cameras was built to capture real-world dynamic scenes. The intensity images and depth maps from each camera view at each time instant are combined to form a Visible Surface Model (VSM) using a multibaseline stereo algorithm. A VSM encodes the structure of the scene visible to one camera (view dependent). By merging the depth maps from different cameras in a common volumetric space, a Complete Surface Model (CSM, view independent) is generated. Generating the CSM from VSMs works well when the VSMs to be merged are individually accurate. A disadvantage is that the original images are not used in the merging process, so it is difficult to assess the photo integrity of the reconstruction.


2. CAMERA CALIBRATION

2.1. Introduction to Camera Calibration

The objective of camera calibration is to determine a set of camera parameters that describe the mapping between 3D reference coordinates and 2D image coordinates. Camera calibration in the context of 3-dimensional machine vision is the process of determining the internal camera geometric and optical characteristics (internal parameters) and the 3D position and orientation of the camera reference frame relative to a certain world coordinate system (external parameters). The overall performance of the system strongly depends on the accuracy of the camera calibration.

The pinhole camera model is used throughout the calibration procedure. This model is based on the principle of collinearity, where each point in the object space is projected by a straight line through the projection center onto the image plane. The pinhole model is only an approximation of the real camera projection. It is not valid when high accuracy is required, and therefore a more comprehensive camera model must be used. The pinhole model is a basis that is extended with corrections for the distorted image coordinates. The most commonly used correction is for radial lens distortion, which causes the actual image point to be displaced radially in the image plane. Also, the centers of curvature of the lens surfaces are not always strictly collinear, which introduces another common distortion type, decentering distortion, which has both radial and tangential components. A proper camera model for accurate calibration can be derived by combining the pinhole model with corrections for the radial and tangential distortion components [12, 13].

The calibration is done in two steps: initialization and then nonlinear optimization. The initialization step computes a closed-form solution for the calibration parameters, not including any lens distortion. The nonlinear optimization step minimizes the total reprojection error over all the calibration parameters. The optimization is done by iterative gradient descent with an explicit computation of the Jacobian matrix.


2.2. Parameters

2.2.1. Intrinsic Parameters

The internal camera model is very similar to that used by Heikkilä [13]. The internal parameters of the camera are:

• Focal length (fc): focal length in pixels

• Principal point (cc): the center of the camera image plane in pixels

• Skew coefficient (alpha_c): angle between the x and y pixel axes

• Distortions (kc): the image distortion coefficients (radial and tangential distortions)

Definition of the intrinsic parameters:

Let P be a point in space with coordinate vector XX_C = [X_C; Y_C; Z_C] in the camera reference frame. The projection of P onto the image plane according to the intrinsic parameters is computed as follows.

Let x_n be the normalized pinhole image projection:

    x_n = [X_C/Z_C; Y_C/Z_C] = [x; y]    - (1)

Let

    r^2 = x^2 + y^2    - (2)

After including the lens distortion, the new normalized point coordinate x_d is defined as follows:

    x_d = [x_d(1); x_d(2)] = (1 + kc(1)*r^2 + kc(2)*r^4 + kc(5)*r^6) * x_n + dx    - (3)

where dx is the tangential distortion vector:

    dx = [ 2*kc(3)*x*y + kc(4)*(r^2 + 2*x^2) ;
           kc(3)*(r^2 + 2*y^2) + 2*kc(4)*x*y ]    - (4)

Once the distortion is applied, the final pixel coordinates x_pixel = [x_p; y_p] and the normalized coordinate vector x_d are related to each other through the linear equation:

    [x_p; y_p; 1] = KK * [x_d(1); x_d(2); 1]    - (5)


where KK is known as the camera matrix:

    KK = [ fc(1)   alpha_c*fc(1)   cc(1) ;
           0       fc(2)           cc(2) ;
           0       0               1     ]    - (6)
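The intrinsic model of equations (1) through (6) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the thesis implementation (which uses the Matlab toolbox); the function name project_to_pixel is hypothetical, and kc is indexed from zero here rather than one:

```python
import numpy as np

def project_to_pixel(X_c, fc, cc, alpha_c, kc):
    """Project a 3D point in camera coordinates to pixel coordinates
    using the pinhole model with radial and tangential distortion."""
    x, y = X_c[0] / X_c[2], X_c[1] / X_c[2]       # (1) normalized projection
    r2 = x * x + y * y                            # (2)
    # (4) tangential distortion vector
    dx = np.array([2 * kc[2] * x * y + kc[3] * (r2 + 2 * x * x),
                   kc[2] * (r2 + 2 * y * y) + 2 * kc[3] * x * y])
    # (3) radial distortion applied to the normalized point
    radial = 1 + kc[0] * r2 + kc[1] * r2**2 + kc[4] * r2**3
    xd = radial * np.array([x, y]) + dx
    # (5)-(6) camera matrix maps distorted coordinates to pixels
    KK = np.array([[fc[0], alpha_c * fc[0], cc[0]],
                   [0.0,   fc[1],           cc[1]],
                   [0.0,   0.0,             1.0]])
    u, v, w = KK @ np.array([xd[0], xd[1], 1.0])
    return u / w, v / w
```

With all distortion coefficients set to zero the sketch reduces to the plain pinhole projection, which is a useful sanity check.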

2.2.2. Extrinsic Parameters

The extrinsic parameters are the rotation and translation matrices. Consider the calibration grid of one of the calibration images.

Figure 1: Calibration grid of one of the calibration images, showing the image points (+) and the reprojected grid points (o)

Let P be a point in space with coordinate vector XX = [X; Y; Z] in the grid reference frame (Figure 1), and let XX_C = [X_C; Y_C; Z_C] be the coordinate vector of P in the camera reference frame. XX and XX_C are related to each other through the following rigid motion equation:

    XX_C = R_C * XX + T_C    - (7)

The translation vector T_C is the coordinate vector of the origin of the grid pattern (O) in the camera reference frame, and the third column of the matrix R_C is the surface normal vector of the plane containing the planar grid in the camera reference frame.


2.3. Calibration Steps

For the calibration process, the Camera Calibration Toolbox for Matlab developed by the vision group at Caltech [12] is used. A checkerboard pattern is used to calculate the intrinsic and extrinsic parameters of the camera (see Appendix 3). The calibration is done in two steps: initialization and nonlinear optimization. The initialization step computes a closed-form solution for the calibration parameters, which does not include any lens distortion. The nonlinear optimization step minimizes the total reprojection error over all the calibration parameters. The optimization is done by iterative gradient descent.
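The refinement step can be sketched as a least-squares minimization of the reprojection error. The toolbox itself uses gradient descent with an explicit Jacobian; the sketch below instead uses SciPy's least_squares, and the project(params, X) argument is a hypothetical helper standing in for the full camera model of equations (1) through (7):

```python
import numpy as np
from scipy.optimize import least_squares

def reprojection_residuals(params, world_pts, image_pts, project):
    """Stack the pixel-space residuals over every calibration point.
    `project(params, X)` maps a 3D point to pixel coordinates under
    the current parameter estimate."""
    res = []
    for X, uv in zip(world_pts, image_pts):
        u, v = project(params, X)
        res.extend([u - uv[0], v - uv[1]])
    return np.asarray(res)

def refine(params0, world_pts, image_pts, project):
    # Nonlinear refinement of all calibration parameters by
    # minimizing the total reprojection error.
    sol = least_squares(reprojection_residuals, params0,
                        args=(world_pts, image_pts, project))
    return sol.x
```

The same residual stacking works for any parameterization, which is why the toolbox can refine intrinsics and extrinsics jointly.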

2.4. Setup

A Scorpion B&W CCD camera with a resolution of 640 by 480 is used to capture the images. The camera has a 12-pin GPIO interface which can be used to trigger the camera to capture images (see Appendix 1). An RT-12 indexed rotary positioning table is used to capture images of the object from different angles. A programmable controller is used to drive the MDrive 23 stepper motor, which controls the position of the rotary table in precise steps (see Appendix 2). The trigger signal on the GPIO pin can be synchronized with the position of the turntable such that the camera is triggered when the turntable rotates by a certain fixed step angle and stops. By doing so we can be sure that the turntable is in exactly the same position while capturing the image of the object as while capturing the image of the checkerboard pattern used to calculate the extrinsic parameters. Figure 2 shows the setup.

Figure 2: The Setup


3. IMAGE ACQUISITION

Once all the required calibration parameters are acquired, the next step is to take images of the object to be modeled. The object space is illuminated by a set of lights and is assumed to be an approximately Lambertian scene with constant illumination. A set of 20 images is captured without the object of interest in the space (in this case with a Scorpion B&W CCD camera with a resolution of 640 by 480). The average of these is taken to remove any salt-and-pepper noise that might be introduced in the image acquisition process. This image is used as the reference background.

A programmable controller is used to drive a motor which moves the indexed turntable in precise steps, e.g., 10 degrees. The object to be modeled is placed at the center of the turntable. Images of the object are captured at the predefined intervals, e.g., 36 images for 10-degree steps.

3.1. Background Subtraction and Silhouette Extraction

The problem of extracting an object from an image or a video sequence is a

fundamental and crucial problem of many vision systems that include video surveillance,

object detection and tracking or human-machine interface. Typically, the approach for

discriminating objects from the background scene is background subtraction. The idea of

background subtraction is to subtract the current image from a reference image, which is

acquired from a static background during a period of time. The subtraction leaves only

non-stationary or new objects, which include the object's entire silhouette region. This

technique has been used for several years in many vision systems as a preprocessing step

for object detection [15].

One problem with generating silhouettes by background subtraction is removing shadows effectively. In our setup the object space is illuminated by a set of lights, which results in an approximately Lambertian space; that is, each point in the scene has approximately constant illumination. Hence there are no shadows cast by the object. An appropriate threshold is chosen and the images are segmented into background and foreground.


A background model is created by averaging 20 frames taken at regular intervals, as shown in Figure 3(a). This is done to cancel the effects of ambient light flickering and salt-and-pepper noise that may be introduced in the images. Figure 3(b) shows the image of the object placed at the center of the turntable. Background subtraction is performed

to extract the object from the entire scene. The extracted object is shown in Figure 3(c).

This image is thresholded to get a binary image. An appropriate value for thresholding is

chosen by trial and error for one of the images. Since the light intensity remains almost

the same throughout the image acquisition process, the same value is used for the rest of

the images. The resulting image is the silhouette image as shown in Figure 3(d). A

silhouette image is a binary image, with the value at a point indicating whether or not the

visual ray from the optical center through that image point intersects an object surface in

the scene. Thus each pixel is either a silhouette point or a background point.

Figure 3: (a) Reference Background; (b) Image of the Object; (c) Image after Background Subtraction; (d) Image after Thresholding
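The background-model and silhouette steps above can be sketched with NumPy. This is an illustrative sketch rather than the thesis code, and the default threshold value is a hypothetical placeholder for the value chosen by trial and error as described:

```python
import numpy as np

def build_background(frames):
    """Average a stack of background frames (H x W each) to suppress
    flicker and salt-and-pepper noise."""
    return np.mean(np.stack(frames), axis=0)

def extract_silhouette(image, background, threshold=30.0):
    """Background subtraction followed by thresholding: a pixel is a
    silhouette point if it differs enough from the reference background."""
    diff = np.abs(image.astype(float) - background)
    return diff > threshold   # binary silhouette image
```

Because the lighting stays nearly constant during acquisition, the same threshold can be reused for every view, exactly as the text describes.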


4. 3D RECONSTRUCTION

4.1. Introduction to Voxel Carving

Reconstructing a 3D shape using 2D silhouettes from multiple images is also

called voxel carving, volume intersection or shape from silhouettes. The intersection of

the cones associated with a set of cameras/camera views defines a volume of scene space

in which the object is guaranteed to lie. The volume only approximates the true 3D shape,

depending on the number of views, the position of the viewpoints, and the complexity of

the object. Since concave patches are not observable in any silhouette, a silhouette based

reconstruction encloses the true volume. Laurentini [16] characterized the best

approximation, obtainable by an infinite number of silhouettes captured from all

viewpoints outside the convex hull of the object, as the visual hull.

Many methods have been developed for constructing volumetric models from a

set of silhouette images [7, 16, 17, 18, 19, 21, 22]. Starting from a bounding volume that

is known to enclose the entire scene, the volume is discretized into voxels and the task is

to create a voxel occupancy description corresponding to the intersection of back-

projected silhouette cones. The main step in these algorithms is the intersection test.

Some methods back project the silhouettes, creating an explicit set of cones that are then

intersected either in 3D [18, 19], or in 3D after projecting voxels into the images [20, 21].

Alternatively, it can be determined whether each voxel is in the intersection by projecting

it into all of the images and testing whether it is contained in every silhouette [22].

In practice only a finite number of silhouettes are combined to reconstruct the

scene, resulting in an approximation that includes the visual hull as well as other scene

points. Figure 4 shows an example of the volume reconstructed from three silhouettes. The

generalized cones associated with the three images result in a reconstruction that includes

the object (black), points in concavities that are not visible from any view points (brick

texture), and points that are not visible from any of the three given views (gray).


Figure 4: Reconstruction from three views

Shapes reconstructed from silhouettes have been used successfully in a variety of

applications, including virtual reality [23], real-time human motion modeling [24, 25] and

building an initial coarse scene model [26]. In applications such as real-time image based

rendering of dynamic scenes, where an explicit scene model is not an essential intermediate step, new views can be rendered directly using a visual ray intersection test [27].


4.2. Voxel Carving by Silhouette Extraction

The volume of interest is divided into 80*80*180 equal-sized voxels, each a 1 mm cube. Figure 5 illustrates the voxel space created.

Figure 5: The Voxel space
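The 80*80*180 grid of 1 mm voxels can be generated with a NumPy meshgrid. This is an illustrative sketch; the origin of the bounding volume is an assumed parameter, not a value stated in the thesis:

```python
import numpy as np

def make_voxel_grid(nx=80, ny=80, nz=180, size_mm=1.0, origin=(0.0, 0.0, 0.0)):
    """Return an (nx*ny*nz, 3) array of voxel center coordinates in mm.
    The bounding-volume origin is an assumed parameter."""
    xs = origin[0] + (np.arange(nx) + 0.5) * size_mm
    ys = origin[1] + (np.arange(ny) + 0.5) * size_mm
    zs = origin[2] + (np.arange(nz) + 0.5) * size_mm
    X, Y, Z = np.meshgrid(xs, ys, zs, indexing="ij")
    return np.stack([X.ravel(), Y.ravel(), Z.ravel()], axis=1)
```

Each row of the result is one voxel center that can then be projected into the camera views as described below.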

The voxels are projected on a particular image plane using the intrinsic and

extrinsic parameters of the camera calculated as described in section 2. The projection of

one voxel on a plane results in 8 points corresponding to the 8 vertices of the voxel cube.

A bounding box containing all the 8 points is calculated. Figure 6 illustrates the process

of projecting a particular voxel on to the camera image plane.

Figure 6: Projecting a voxel on to the camera image plane [7]


The following algorithm explains the process of projecting each of the voxels on a

camera image plane and calculating the bounding box associated with that voxel.

Internal Parameters: fc is the focal length of the camera, cc is the image plane center, alpha_c is the skew coefficient, and kc is a 1x5 matrix containing the radial and tangential distortion coefficients.

External Parameters: Tc_ext is the translation matrix and Rc_ext is the rotation matrix.

    cam_coordinates = Rc_ext * voxel_coordinates + Tc_ext    - (8)

    Xc = cam_coordinates(1)
    Yc = cam_coordinates(2)
    Zc = cam_coordinates(3)    - (9)

    normalized_projection = [Xc/Zc; Yc/Zc]    - (10)

    x = normalized_projection(1)
    y = normalized_projection(2)
    r^2 = x^2 + y^2    - (11)

    dx = [ 2*kc(3)*x*y + kc(4)*(r^2 + 2*x^2) ;
           kc(3)*(r^2 + 2*y^2) + 2*kc(4)*x*y ]    - (12)

    distortion_coordinates = (1 + kc(1)*r^2 + kc(2)*r^4) * [x; y] + dx    - (13)

    xd = distortion_coordinates(1)
    yd = distortion_coordinates(2)    - (14)

    pixel_x = round(fc(1)*(xd + alpha_c*yd) + cc(1))
    pixel_y = round(fc(2)*yd + cc(2))    - (15)


The above algorithm is applied to each of the 8 vertices of every voxel; pixel_x and pixel_y contain the x and y coordinates of the projection of a voxel's vertex onto the image plane.

From the eight points obtained from the above algorithm, the maximum and minimum values of the x and y coordinates are calculated. From these a bounding box, an axis-aligned rectangle that contains all 8 projected points, is constructed.
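The vertex enumeration and bounding-box step can be sketched as follows. This is an illustrative sketch only; it assumes the projected pixel coordinates of the 8 vertices have already been computed with equations (8) through (15):

```python
import numpy as np

def voxel_vertices(center, size=1.0):
    """Return the 8 corner vertices of a cubic voxel centered at `center`."""
    h = size / 2.0
    offsets = np.array([[sx, sy, sz] for sx in (-h, h)
                                     for sy in (-h, h)
                                     for sz in (-h, h)])
    return np.asarray(center, dtype=float) + offsets

def bounding_box(pixel_points):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) enclosing the
    projected vertex coordinates; floor/ceil keeps all 8 points inside."""
    pts = np.asarray(pixel_points, dtype=float)
    xmin, ymin = pts.min(axis=0)
    xmax, ymax = pts.max(axis=0)
    return (int(np.floor(xmin)), int(np.floor(ymin)),
            int(np.ceil(xmax)), int(np.ceil(ymax)))
```

The bounding box over-approximates the true projected hexagonal footprint of the cube, which is acceptable here because it is only used for a conservative overlap test.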

The Figure 7 shows the bounding box calculated from the projected voxels on one

of the camera image planes from one particular view. The red dots are the 8 vertices of a

voxel and the blue square is the bounding box that encloses the projected voxel vertices

on the image plane.


Figure 7: Bounding Box

Each voxel is classified as either inside or outside the silhouette by checking whether the bounding box associated with that voxel overlaps the silhouette. This is called the voxel intersection test. The voxels that are outside are discarded. Figure 8 shows the projection of three of the voxels on the silhouette of the object taken at zero degrees. One of them is completely inside the silhouette, the second voxel is on the surface, and the third voxel is completely outside. The first two voxels are kept and the third voxel is discarded.
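The intersection test and the carving loop over all views can be sketched as below. This is illustrative only; silhouettes is a list of binary images and boxes[v][i] the precomputed bounding box of voxel v in view i, both hypothetical inputs:

```python
import numpy as np

def box_intersects_silhouette(silhouette, box):
    """True if any silhouette pixel falls inside the bounding box."""
    xmin, ymin, xmax, ymax = box
    region = silhouette[ymin:ymax + 1, xmin:xmax + 1]
    return bool(np.any(region))

def carve(voxel_ids, boxes, silhouettes):
    """Keep only voxels whose bounding box overlaps the silhouette in
    every view; the survivors approximate the visual hull."""
    kept = []
    for v in voxel_ids:
        if all(box_intersects_silhouette(sil, boxes[v][i])
               for i, sil in enumerate(silhouettes)):
            kept.append(v)
    return kept
```

A single failed view is enough to discard a voxel, which is what makes the result the intersection of the back-projected silhouette cones.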


Figure 8: Projection of voxels on the silhouette on one of the camera views

This process is repeated for all the camera views. At the end the voxels that are

retained represent the visual hull of the object. The result is shown in Figure 9.

Figure 9: Carved Voxels after silhouette intersection test

4.3. Voxel Carving by Coloring

As explained in section 4.1, the voxel carving algorithm generates only the visual hull of the object. This is an approximation of the actual object: the algorithm cannot handle any concavities that might be present on the surface of the object.



Scene reconstruction by voxel coloring [1] is another technique, different from other approaches in its ability to cope with large changes in visibility and occlusions. The voxel coloring problem is to assign colors (radiances) to voxels in a 3D volume so as to achieve consistency with a set of basis images, as illustrated in Figure 10. It is assumed that the scene is composed of approximately Lambertian surfaces under fixed illumination. Under these conditions, the radiance at each point is isotropic and can therefore be described by a scalar value which is called color. A 3D scene S is represented as a set of opaque Lambertian voxels, each of which occupies a finite homogeneous scene volume centered at a point V ∈ S and has an isotropic radiance color(V, S). It is assumed that the scene is entirely contained within a known, finite bounding volume. The set of all voxels in the bounding volume is referred to as the voxel space and denoted by the symbol ν. An image is specified by the set I of all its pixels, each centered at a point p ∈ I and having irradiance color(p, I).

Figure 10: Voxel Coloring. Given a set of basis images and a grid of voxels, color values

to voxels have to be assigned in a way that is consistent with all images [1]

Given an image pixel p ∈ I and scene S, we refer to the voxel V ∈ S that is visible in I and projects to p by V = S(p). A scene S is said to be complete with respect to a set of images if, for every image I and every pixel p ∈ I, there exists a voxel V ∈ S


such that )(pV S= . A complete scene is said to be consistent with a set of images if, for

every image I and every pixel I∈p

)),((),( SScolorIcolor pp = - (16)

If N denotes the set of all consistent scenes, then the voxel coloring problem can

be defined as:

• Given a set of basis images I0, …, In of a static Lambertian scene and a voxel

space ν, determine a subset S ⊂ ν and a coloring color(V, S), such that S ∈ N.

Two issues that have to be addressed in this case are:

• Uniqueness: Multiple voxel colorings may be consistent with a given set of

images

• Computation: How to compute voxel coloring from a set of input images

without combinatorial search

A consistent voxel coloring exists, corresponding to the set of points and colors on the surfaces of the true Lambertian scene. But the voxel coloring is rarely unique, given that

a set of images can be consistent with more than one 3D scene. By Spatial Ambiguity a

voxel contained in one scene may not be contained in another as illustrated in Figure 11.

And by Color Ambiguity a voxel may be contained in two consistent scenes, but have

different colors in each as illustrated in Figure 12. Hence additional constraints are

needed to make the problem well defined.

S S’

Figure 11: Example of Spatial ambiguity. Both voxel colorings appear identical from

these two viewpoints (S and S’), despite having no colored voxels in common [1]


S S’

Figure 12: Example of Color ambiguity. Both voxel colorings appear identical from these

two viewpoints (S and S’). But the second row, center voxel has different color

assignment in the two scenes [1]

4.3.1. Color Invariants

The only way to recover intrinsic scene information is through invariants—

properties that are satisfied by every consistent scene. For instance, consider the set of

voxels that are contained in every consistent scene. Laurentini [29] described how these

invariants, called hard points, could be recovered by volume intersection from silhouette

images. Hard points provide absolute information about the true scene but are relatively

rare; some images may yield none. A more frequently occurring type of invariant is

related to color rather than shape. A voxel V is said to be color invariant with respect to a

set of images if:

• V is contained in a scene consistent with the images

• For every pair of consistent scenes S and S′, V ∈ S ∩ S′ implies color(V, S) = color(V, S′).

Unlike shape invariance, color invariance does not require that a point be

contained in every consistent scene. As a result, color invariants are more prevalent than

hard points. The union of all color invariants itself yields a consistent scene, i.e., a

complete voxel coloring as illustrated in Figure 13.


Figure 13: Each of the six voxels has the same color in every consistent scene in which it

is contained. The collection of all such color invariants forms a consistent voxel coloring

denoted by S [1]

4.3.2. Ordinal Visibility Constraint

Color invariants are defined with respect to the combinatorial space N of all

consistent scenes. In order to compute the color invariants by a single pass through the

voxel space, it is necessary for the input camera configurations to satisfy the ordinal

visibility constraint which is stated as:

• There exists a real non-negative function D: ℝ³ → ℝ such that, for all scene points P and Q and all input images I, P occludes Q in I only if D(P) < D(Q).

Figure 14 shows the two possible camera configurations satisfying the constraint.

In our experiments the configuration shown in Figure 14 (a) is used.


(a) (b)

Figure 14: Compatible camera configurations. (a) An overhead inward-facing

camera moving 360 degrees around the object. (b) An array of outward facing

cameras [1]

4.3.3. Voxel Coloring by Layered Scene Decomposition

The ordinal visibility constraint limits the possible camera view configurations, but in return the visibility relationships are simplified. It becomes possible to partition the scene

into a series of voxel layers that obey a visibility relationship, i.e., for every input image,

voxels only occlude other voxels that are in subsequent layers. Hence the visibility

relationships are resolved by evaluating voxels one layer at a time.

To formulate the idea of visibility ordering, the following partition of the 3D

space into voxel layers of uniform distance from the camera volume is defined.

ν_d = { V ∈ ν | D(V) = d },    ν = ∪_{i=1}^{r} ν_{d_i}   - (17)

where d_1, ..., d_r is an increasing sequence of distances from the camera volume.
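In code, the partition of equation (17) amounts to bucketing voxels by their distance value; a minimal sketch, where the distance function D is an assumption standing in for, e.g., orthogonal distance to the camera plane:

```python
from collections import defaultdict

def partition_into_layers(voxels, D):
    """Group voxels into layers of equal distance D(V) from the camera
    volume; iterating the layers in increasing d then gives an
    occlusion-compatible visit order."""
    layers = defaultdict(list)
    for v in voxels:
        layers[D(v)].append(v)
    # Return layers sorted by increasing distance d1 < ... < dr.
    return [layers[d] for d in sorted(layers)]

# Toy example: cameras along the plane z = 0, D = orthogonal distance (z).
voxels = [(x, y, z) for x in range(2) for y in range(2) for z in range(3)]
layers = partition_into_layers(voxels, D=lambda v: v[2])
print([len(layer) for layer in layers])  # → [4, 4, 4]
```
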

For the sake of illustration, consider a set of views positioned along a line facing a

two-dimensional scene as shown in Figure 15. Choosing D to be orthogonal distance to

the line gives rise to a series of parallel linear layers that move away from the cameras.

Notice that for any two voxels P andQ , P can occlude Q from a basis viewpoint only


if Q is in a higher layer than P. The linear case is easily generalized to any set of cameras satisfying the ordinal visibility constraint.

Figure 15: 2D Layered scene traversal. Voxels can be partitioned into a series of layers of

increasing distance from the camera volume [1]

Decomposition of a 3D scene can be done in a similar manner. In the 3D case the

layers become surfaces that expand outward from the camera volume as shown in Figure

16.

Figure 16: 3D Layered Scene Traversal. The layers expand outward from the camera volume, from L0 through L3 [1]


To compensate for the effects of image quantization and noise, suppose that the

images are discretized on a grid of finite non-overlapping pixels. If a voxel V is not fully occluded in image I_j, its projection overlaps a nonempty set of image pixels, π_j.

Without noise or quantization effects, a consistent voxel should project to a set of pixels

with equal color values. In the presence of these effects, the correlation λ_V of the pixel colors is evaluated to measure the likelihood of voxel consistency. In this case, the value of λ_V was chosen to be the standard deviation of the image pixels onto which the voxel V projects.
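For a single-channel image this consistency measure is just a population standard deviation compared against a threshold; a minimal sketch (real implementations work per color channel, which is an extension not shown here):

```python
import statistics

def lambda_V(pixel_colors, threshold):
    """Consistency measure for a voxel: the standard deviation of the
    (grayscale) colors of the unoccluded pixels it projects to. The
    voxel is accepted as consistent when lambda_V falls below threshold."""
    lam = statistics.pstdev(pixel_colors)
    return lam, lam < threshold

# A voxel projecting to nearly equal pixel values is consistent...
print(lambda_V([100, 101, 99, 100], threshold=5.0))
# ...while widely differing values indicate an inconsistent voxel.
print(lambda_V([100, 30, 200, 90], threshold=5.0))
```
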

4.3.4. Single Pass Algorithm

In order to evaluate the consistency of a voxel, first the set of pixels π_j that overlap V’s projection in I_j is calculated. Neglecting occlusions, it is straightforward to

compute a voxel’s image projection, based on the voxel’s shape and the known camera

configuration (intrinsic and extrinsic parameters). The term footprint [36] is used to

denote this projection, corresponding to the intersection with the image plane of all rays

from the camera center intersecting the voxel. Accounting for occlusions is more

difficult: only those images, and only those pixel positions, from which V is visible should be included. This difficulty is resolved by using the ordinal visibility constraint to visit voxels in an occlusion-compatible order and marking pixels as they are accounted for.

Initially all pixels are unmarked. When a voxel is visited, π_j is defined to be the set of unmarked pixels that overlap V’s footprint. When a voxel is evaluated and found to be consistent, all pixels in π_j are marked. Because of the occlusion-compatible order of voxel evaluation, this strategy is sufficient to ensure that π_j contains only the pixels from which each voxel is visible. By assumption, voxels within a layer do not occlude each other.

The complete voxel coloring algorithm can be stated as follows [1]:
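The algorithm itself appears in [1] as a figure; as a rough, hypothetical sketch of the layered traversal and pixel-marking bookkeeping it describes (footprint and is_consistent are stand-ins for the projection step and the λ_V test, not the thesis’s code):

```python
def voxel_coloring(layers, images, footprint, is_consistent):
    """Single-pass voxel coloring sketch: visit voxel layers in
    increasing distance from the cameras; evaluate each voxel only
    against still-unmarked pixels, so occlusion is handled implicitly."""
    colored = {}                                      # voxel -> assigned color
    marked = {j: set() for j in range(len(images))}   # pixels accounted for
    for layer in layers:
        for v in layer:
            hits = {}
            pixels = []
            for j, image in enumerate(images):
                pi_j = footprint(v, j) - marked[j]    # unmarked footprint pixels
                hits[j] = pi_j
                pixels += [image[p] for p in pi_j]
            if pixels and is_consistent(pixels):
                colored[v] = sum(pixels) / len(pixels)
                for j, pi_j in hits.items():          # mark pixels only on success
                    marked[j] |= pi_j
    return colored

# Toy scene: one image with two pixels; voxel 'c' lies behind 'a'
# and is fully occluded once 'a' is accepted.
images = [{0: 10, 1: 200}]
footprints = {('a', 0): {0}, ('b', 0): {1}, ('c', 0): {0}}
out = voxel_coloring(
    layers=[['a', 'b'], ['c']],
    images=images,
    footprint=lambda v, j: footprints[(v, j)],
    is_consistent=lambda px: max(px) - min(px) < 5,
)
print(out)  # → {'a': 10.0, 'b': 200.0}
```
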


The threshold corresponds to the maximum allowable correlation error. A very

small value will result in an accurate but incomplete reconstruction. On the other hand, a

large value yields a more complete reconstruction but includes some erroneous voxels.

Instead of thresholding the correlation error, it is possible to optimize for model completeness.

A completeness threshold may be chosen that specifies the minimum allowable

percentage of image pixels left unmarked. For instance, a completeness threshold of 75%

requires that at least 3/4th of the image pixels correspond to the projection of the colored

voxels.


The result of applying the voxel coloring algorithm alone to the voxel space to carve out the object is shown below in Figure 17. Some extra voxels are present in the final carved model. This is because the algorithm assumes that the scene is composed of Lambertian surfaces under fixed illumination. This is an ideal condition that does not hold in a practical scenario, as the surface of the object may reflect light and the scene may be illuminated by ambient light; the extra voxels result from these effects. Better results are achieved by first obtaining the convex hull of the object by silhouette intersection and then applying the voxel coloring algorithm to the voxel set representing the convex hull of the object. Figure 18 shows the result of carving by first applying the silhouette intersection test and then the voxel coloring algorithm.
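The silhouette pre-pass amounts to a per-voxel intersection test: a voxel survives only if its projection falls inside the object silhouette in every view. A minimal sketch, where project is a hypothetical camera-projection helper and each silhouette is a set of pixel locations:

```python
def silhouette_carve(voxels, silhouettes, project):
    """Keep a voxel only if it projects inside the silhouette in all
    views; voxel coloring is then applied to the surviving set."""
    kept = []
    for v in voxels:
        if all(project(v, j) in sil for j, sil in enumerate(silhouettes)):
            kept.append(v)
    return kept

# Toy 2D setup: two orthographic views of a 2x2 grid of voxels.
voxels = [(x, y) for x in range(2) for y in range(2)]
silhouettes = [{0}, {0, 1}]      # view 0 sees only x == 0
project = lambda v, j: v[j]      # view j reads coordinate j
print(silhouette_carve(voxels, silhouettes, project))  # → [(0, 0), (0, 1)]
```
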

Figure 17: Result of voxel coloring

algorithm alone

Figure 18: Final carved

voxels


4.4. Surface Reconstruction

The next step in the process is constructing a surface model out of the voxel

carved model generated from previous steps. Depending on the application several

approaches to the 3D surface generation problem have been proposed [30, 31, 32, 33].

The most popular approach to generating triangular surfaces, when sampled scalar data is structured on a cubical grid, is Marching Cubes [30, 31].

Marching Cubes uses a divide-and-conquer approach to locate the surface in a logical cube (voxel). The algorithm determines how the surface intersects this voxel, then moves (or marches) to the next voxel. To find the surface intersection in a voxel, we assign a value of one to a voxel vertex if the data value at that vertex exceeds (or equals) the value of the surface we are constructing; these vertices are inside or on the surface. Voxel vertices with values below the surface receive a value of zero and are outside the surface. The surface intersects those voxel edges where one vertex is inside the surface (one) and the other is outside the surface (zero). With this assumption, the surface topology within a voxel is determined.

Since there are 8 vertices in each voxel and two states, inside and outside, there are 2^8 = 256 ways a surface can intersect the cube. By enumerating these 256 cases, a lookup table that stores the surface-edge intersections is created. The table contains the edges intersected for each case.

Triangulating the 256 cases is possible but tedious and error prone. Two different

symmetries of the cube reduce the number of cases from 256 to 14 patterns. Figure 19

shows the triangulation of the 14 patterns. Permutation of these 14 basic patterns using

complementary and rotational symmetry produces the 256 cases.


Figure 19: Triangulated Voxels

The indexing convention used to number the edges and vertices in our algorithm

is shown below in Figure 20.

Figure 20: Indexing convention of the vertices and edges of a voxel


If vertex 1 is below or inside the isosurface, i.e. having a value of zero, and all

other vertices are above the isosurface, i.e. having a value of one, then we would create a

triangular surface that cuts edges 1, 4 and 9 as shown in Figure 21. The exact position of the vertices of the triangular surface depends on the values at vertices 1, 2, 4 and 5.

Figure 21: Vertex 1 is inside the surface and the rest of the vertices are outside the

surface.

Depending on the user-specified threshold value, the vertices of the voxel are defined to be either inside or outside the surface, and an 8-bit binary number, the voxel index, is generated accordingly. For example, if vertex 1 is inside the surface and the other vertices are outside, then the voxel index would be 11111110, where a value of zero indicates that the vertex is inside and a value of one indicates that the vertex is outside the surface. The bit positions correspond to the vertex numbers. Thus if vertices 1, 2, 4 and 8 are

inside then the voxel index would be 01110100. A look up table of the intersecting edges

is made. Given the voxel index, the corresponding entry in the edge table gives the edges

that will be intersected by the triangulated surface. For example, if the voxel index is

11111110, then the corresponding entry in the edge table is 000100001001; that is, if vertex 1 is inside the surface, then edges 1, 4 and 9 are intersected by the surface. From

this information, a triangulated surface model, according to the convention shown in

Figure 19 is generated.
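The index computation described above can be sketched directly. The bit ordering (vertex 1 and edge 1 in the least significant bit) is an assumption chosen to reproduce the document’s 11111110 and 01110100 examples, and the one-entry edge table is illustrative, not the full 256-entry table:

```python
def voxel_index(inside):
    """Build the 8-bit voxel index from per-vertex inside flags,
    vertex 1 in the least significant bit; inside vertices contribute
    a 0 bit and outside vertices a 1 bit."""
    idx = 0
    for bit, is_inside in enumerate(inside):   # inside[0] is vertex 1
        if not is_inside:
            idx |= 1 << bit
    return idx

# Only vertex 1 inside reproduces the 11111110 example.
only_v1 = [True] + [False] * 7
print(format(voxel_index(only_v1), '08b'))     # → 11111110

# Vertices 1, 2, 4 and 8 inside reproduce 01110100.
v1248 = [v in (1, 2, 4, 8) for v in range(1, 9)]
print(format(voxel_index(v1248), '08b'))       # → 01110100

# One edge-table entry: index 11111110 flags edges 1, 4 and 9,
# i.e. the 12-bit pattern 000100001001 (edge 1 in the lowest bit).
EDGE_TABLE = {0b11111110: 0b000100001001}
edges = [e + 1 for e in range(12) if EDGE_TABLE[0b11111110] >> e & 1]
print(edges)                                   # → [1, 4, 9]
```
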

The intersection points of the surface on the edges of the voxels can be calculated

by linear interpolation. If P1 and P2 are the vertices of a cut edge and V1 and V2 are the

scalar values at each of the vertices, the intersection point P is given by


P = P1 + (threshold - V1)(P2 - P1) / (V2 - V1)   - (18)
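Equation (18) is plain linear interpolation along the cut edge; a minimal sketch for 3D vertex positions:

```python
def interpolate_vertex(p1, p2, v1, v2, threshold):
    """Linear interpolation of the surface crossing along the edge
    (P1, P2) with scalar values (V1, V2), per equation (18)."""
    t = (threshold - v1) / (v2 - v1)
    return tuple(a + t * (b - a) for a, b in zip(p1, p2))

# Edge from (0,0,0) with value 0 to (1,0,0) with value 10; isovalue 2.5
# crosses a quarter of the way along the edge.
print(interpolate_vertex((0, 0, 0), (1, 0, 0), 0.0, 10.0, 2.5))  # → (0.25, 0.0, 0.0)
```
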

The last part of the algorithm involves forming the correct facets from the

positions where the surface intersects the edges of the voxel. Again a table is used which

makes use of the same voxel index but allows the vertex sequence to be looked up for as many triangular facets as are necessary to represent the surface within the voxel. Figure 22

shows the final surface reconstructed model.

Figure 22: Final Surface Reconstructed model


4.4.1. Problems Associated with the Marching Cubes Algorithm

The main problems with Marching Cubes are the ambiguities inherent in the data sampling. These ambiguities can appear on a face or inside a voxel and may lead to small holes appearing in the reconstructed surface [34, 35].

4.4.1.1. Ambiguous Face

The ambiguity arises when a face has two diagonally opposite vertices inside the surface

(with a value 0) and the other two diagonally opposite vertices outside the surface (with a

value 1). For ambiguous faces, the information on the vertices is insufficient to decide

how to connect the intersection point on the edges. One such example is shown in Figure

23. When two adjacent voxels, one of the form of case 3 and the other of the form of

case 6 are joined, it forms a hole in the middle.

Figure 23: Formation of hole in the surfaces

4.4.1.2. Internal Ambiguities

The same set of intersection points on the edges may lead to different configurations of tiling. One such example is shown in Figure 24.

Figure 24: Two different configurations of triangulation with the same set of

intersection point


4.4.2. Resolving Ambiguities

To resolve the ambiguities, the basic lookup table is extended with additional cases, as shown in Figure 25, and the correct topology is selected by solving for the ambiguities as explained in [35].

Figure 25: Extended Lookup Table


4.4.2.1. Resolving Ambiguities on the Face

For each configuration in the lookup table, the Marching Cubes method uses only one isosurface topology, while a trilinear function often permits several different variants. The trilinear function is given below.

F(q, s, t) = (1-q)(1-s)(1-t) F000 + q(1-s)(1-t) F100 + (1-q)s(1-t) F010
           + (1-q)(1-s)t F001 + qs(1-t) F110 + q(1-s)t F101
           + (1-q)st F011 + qst F111   - (19)

where q, s and t represent the local coordinates of the voxel, varying from 0 to 1, and F000 … F111 represent the values at the vertices of the voxel. The function F varies bilinearly over a face or any plane parallel to a face. By fixing one of the variables, for example setting q = q0, the equation takes the form

F(s, t) = (1-s)(1-t) A + s(1-t) B + st C + (1-s)t D   - (20)

where
A = (1-q0) F000 + q0 F100,   B = (1-q0) F010 + q0 F110,
C = (1-q0) F011 + q0 F111,   D = (1-q0) F001 + q0 F101

On a face, A, B, C and D are equal to the values at the corners of the face. On an ambiguous face, let A and C be inside the surface and B and D be outside the surface. In order to determine which nodes are joined, it is sufficient to compare the two products AC and BD: if AC > BD, then the nodes that are inside are joined and the outside


nodes are separated, otherwise the nodes that are outside are joined as shown in Figure

26.

Figure 26: Resolving the ambiguity on a face
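The face test above reduces to comparing the two diagonal products; a minimal sketch of the decision rule, with illustrative corner values (A and C diagonally opposite and inside, B and D outside):

```python
def resolve_face(A, B, C, D):
    """Decide the contour topology on an ambiguous face by comparing
    the diagonal products of the bilinear corner values, following the
    AC > BD rule: join the inside nodes when AC exceeds BD, otherwise
    join the outside nodes."""
    return "join inside (A, C)" if A * C > B * D else "join outside (B, D)"

print(resolve_face(A=4.0, B=1.0, C=3.0, D=2.0))   # → join inside (A, C)
print(resolve_face(A=2.0, B=3.0, C=1.0, D=4.0))   # → join outside (B, D)
```
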

4.4.2.2. Resolving Internal Ambiguities

There are different methods of resolving internal ambiguities. One of them is the comparison of hyperbolas on the opposite faces of the voxel where the internal ambiguities exist. If two areas of the same sign (inside or outside the surface) are joined inside the voxel, then the projections of the hyperbolas must intersect each other. Figure 27 shows two different configurations of case 4 [34, 35].

Figure 27: Two configurations of case 4


5. RESULTS

The reconstruction algorithm has been tried on both the phantom model and on the actual specimen. Section 5.1 shows the results on the phantom model and section 5.2 shows the results on the actual specimen.

5.1. On the Phantom Model

Figure 28 (a-d) shows the AAA phantom image (left), the voxel carved model (middle) and the surface model (right).

(a)

(b)


(c)

(d)

Figure 28: Four Different Views of the Phantom Model

5.1.1. Reconstruction Accuracy

Voxel Size (mm)      Volume of the Phantom Model (cc)   Calculated Volume of the Voxel Model (cc)   % Error
2mm x 2mm x 2mm      ~180                               191.48                                      6.37%
1mm x 1mm x 1mm      ~180                               182.773                                     1.54%
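The accuracy figures follow from the voxel count: the model volume is the number of carved voxels times the voxel volume, compared against the roughly 180 cc phantom. A sketch of the bookkeeping; the voxel counts here are back-computed from the reported volumes (not taken from the thesis data), and rounding may differ slightly from the reported percentages:

```python
def voxel_model_volume_cc(n_voxels, voxel_mm):
    """Volume of a carved voxel model: voxel count times voxel volume,
    converted from mm^3 to cc (1 cc = 1000 mm^3)."""
    return n_voxels * voxel_mm ** 3 / 1000.0

def percent_error(measured, reference):
    return abs(measured - reference) / reference * 100.0

# 2 mm voxels: 23935 voxels reproduce the reported 191.48 cc.
vol = voxel_model_volume_cc(23935, 2.0)
print(round(vol, 2), round(percent_error(vol, 180.0), 2))  # → 191.48 6.38
```
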


5.2. On the Actual Specimen

Figure 29 (a-c) shows the results of voxel carving on the actual specimen.

(a)

(b)

(c)

Figure 29: Three Different Views of the Actual Specimen


Accurate reconstruction was not achieved on these images because of the errors in

the calibration parameters that were induced by the movement of the specimen with

respect to the calibration grid in the sequence of images taken.

The projection of the final carved voxels on the object silhouettes is shown in

Figure 30. These images show the error in reconstruction.

Figure 30: Projection of the Carved Voxels on the Silhouettes


6. CONCLUSION AND FUTURE WORK

A surface model of the AAA phantom has been successfully reconstructed. The model has to be refined to obtain a watertight surface model.

6.1. Ways to Improve Results

Two different approaches can be taken to improve the results obtained.

6.1.1. Extended Lookup Table for the Marching Cubes Algorithm

The holes in the surface could be patched by using an extended lookup table for the Marching Cubes algorithm, as explained in sections 4.4.1 and 4.4.2.

6.1.2. Voronoi-based Surface Reconstruction

Another method reconstructs a smooth surface from a finite set of unorganized sample points by using 3-dimensional Voronoi diagrams [37]. The algorithm is based on the 3-dimensional Voronoi diagram and Delaunay triangulation. The output of the Marching Cubes algorithm is also a set of sample points; Figure 31 shows the 3D point cloud obtained from the Marching Cubes algorithm. Thus the Voronoi-based surface reconstruction could be applied to such a dataset. A thorough literature survey has yet to be done on this technique.

Figure 31: 3D point-cloud obtained from the Marching Cube algorithm


6.2. Other Applications

Influence of Microstructure on Conditions for Vertebral Compression Fractures

The scope of this research is to examine the influence of microstructure in creating conditions favorable for the occurrence of vertebral compression fractures. The

investigation will be conducted using synthetic trabecular bone microstructure core

samples and synthetic vertebrae, manufactured from microCT scans of human trabecular

bone and vertebrae, using stereolithography rapid prototyping equipment.

The objectives are:

• Develop a process for designing, manufacturing and testing synthetic vertebrae and vertebral trabecular bone specimens, including development of a systematic, repeatable process for creating osteoporosis-affected microstructure from healthy microstructure.

• Use the above processes to conduct a systematic study on the influence of microstructural variability and deterioration on the mechanical response of vertebrae and vertebral trabecular bone, with emphasis on quantifying and understanding microstructural characteristics and deformation mechanisms that lead to vertebral compression fractures.

• Examine the applicability of a theoretical bifurcation approach to strain localization for predicting the occurrence of vertebral compression fractures.

One of the aspects of the project is to provide a 3-dimensional representation of

the microstructure from MicroCT scans. 3-dimensional surface reconstruction can be

achieved by building a voxel model and applying the Marching Cubes surface triangulation algorithm [38].


References

[1] Steven M. Seitz and Charles R. Dyer, “Photorealistic Scene Reconstruction by

Voxel Coloring”, Int. Journal of Computer Vision, Vol. 35, No. 2, pp.151-173,

1999.

[2] Paul E. Debevec, Camillo J. Taylor, and Jitendra Malik, “Modeling and rendering

architecture from photographs: A hybrid geometry and image based approach”, In

Proc. SIGGRAPH 96, pp. 11-20, 1996.

[3] P. J. Narayanan, Peter W. Rander, and Takeo Kanade, “Constructing virtual

worlds using dense stereo”, Proc, Sixth Int. Conf. on Computer Vision, pp. 3-10,

Jan 1998.

[4] Daniel Scharstein, “Stereo vision for view synthesis”, In Proc. Computer Vision

and Pattern Recognition Conf, pp. 852-858, 1996.

[5] H. Fuchs, G. Bishop, K. Arthur, L. McMillan, R. Bajcsy, S. Lee, H. Farid, and T.

Kanade, “Virtual Space Teleconferencing Using a Sea of Cameras”, Proc. First

Int. Conf. on Medical Robotics and Computer Assisted Surgery, June, 1994, pp.

161-167.

[6] W. Bruce Culbertson, Thomas Malzbender and Gregory G. Slabaugh,

“Generalized Voxel Coloring”, Proc. International Workshop on Vision

Algorithms, Sep 1999, pp. 100-115.

[7] Adem Yasar Mulayim, Ulas Yilmaz and Volkan Atalay, “Silhouette based 3D

Model Reconstruction from Multiple Images”, IEEE Transactions on Systems,

Man and Cybernetics, Part B.

[8] Robert T. Collins, “A Space-Sweep Approach to True Multi-image Matching”,

Proc. Computer Vision and Pattern Recognition Conf., pp. 358-363, 1997.

[9] Robert T. Collins, “Multi-image Focus of Attention for Rapid Site Model

Construction”, Proc. Computer Vision and Pattern Recognition Conf., pp. 575-

581, 1997.

[10] P. J. Narayanan, Peter W. Rander and Takeo Kanade, “Constructing Virtual

Worlds Using Dense Stereo”, Proc. Sixth IEEE Int. Conf. on Computer Vision.

pp. 3-10, 1998.


[11] Takeo Kanade, Peter Rander and P. J. Narayanan. “Virtualized Reality:

Constructing Virtual Worlds from Real Scenes”, IEEE Multimedia, 4(1): pp. 34-

46, 1997.

[12] Jean-Yves Bouguet, ‘Camera Calibration toolbox for Matlab’,

http://www.vision.caltech.edu/bouguetj/calib_doc/

[13] J. Heikkilä and O. Silvén, “A Four-step Camera Calibration Procedure with Implicit Image Correction”, Proc. Computer Vision and Pattern Recognition Conf., 1997.

[14] W. Niem, J. Wingbermuble, “Automatic reconstruction of 3D objects using a

mobile monoscopic camera”, In International Conf. on Recent Advances in 3D

Imaging and Modeling, May 1997, pp. 173-180.

[15] Horprasert, T., Harwood, D., and Davis, L.S. “A statistical approach for real-time

robust background subtraction and shadow detection”. In Proc. IEEE ICCV’99

FRAME-RATE Workshop, Kerkyra, Greece.

[16] A. Laurentini. “The visual hull concept for silhouette based image

understanding”, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 16,

No. 2. pp. 150-162, 1994.

[17] A. Laurentini. “How far 3D shapes can be understood from 2D silhouettes”, IEEE

Trans. Pattern Analysis and Machine Intelligence, Vol. 17, No. 2, pp. 188-195,

1995.

[18] H. Noborio, S. Fukada, and S. Arimoto. “Construction of the octree

approximating three-dimensional objects by using multiple views”, IEEE Trans.

on Pattern Analysis and Machine Intelligence, 10(6):769-782, 1988.

[19] S. K. Srivastava and N. Ahuja. “Octree generation from object silhouettes in

perspective views”, Computer Vision, Graphics and Image Processing, 49:68-84,

1990.

[20] T. H. Hong and M. Shneier. “Describing a robot’s workspace using a sequence of

views from a moving camera”, IEEE Trans. Pattern Analysis and Machine

Intelligence, 7:721-726, 1985.

[21] M. Potmesil. “Generating octree models of 3D objects from their silhouettes in a

sequence of images”, Computer Vision, Graphics and Image Processing, 40:1-20,

1987.


[22] R. Szeliski. “Rapid octree construction from image sequences”, Computer Vision,

Graphics and Image Processing: Image Understanding, 58(1):23-32, 1993.

[23] S. Moezzi, L-C. Tai, and P. Gerard. “Virtual view generation for 3D digital

video”, IEEE Multimedia, 4(1):18-26, 1997.

[24] G. K. M. Cheung, T. Kanade, J-Y. Bouguet, and M. Holler. “A real time system

for robust 3D voxel reconstruction of human motions”, In Proc. Computer Vision

and Pattern Recognition Conf. Vol 2. pp. 714-720, 2000.

[25] S. Moezzi, A. Katkere, D. Kuramura, and R. Jain. Reality modeling and

visualization from multiple video sequences. IEEE Computer Graphics and

Applications, 16(6):58–63, 1996.

[26] G. Cross and A. Zisserman. Surface reconstruction from multiple views using

apparent contours and surface texture. In A. Leonardis, F. Solina, and R. Bajcsy,

editors, Confluence of Computer Vision and Computer Graphics, pages 25–47.

Kluwer, 2000.

[27] W. Matusik, C. Buehler, R. Raskar, S. J. Gortler, and L. McMillan. Image-based

visual hulls. In Proc. SIGGRAPH 2000, pages 369–374, 2000.

[28] Steven M. Seitz and Charles R. Dyer. “Complete structure from four point

correspondences”. In Proc. Fifth Int. Conf. on Computer Vision, pages 330–337,

1995.

[29] Aldo Laurentini. How far 3D shapes can be understood from 2D silhouettes. IEEE

Trans. on Pattern Analysis and Machine Intelligence, 17(2):188–195, 1995.

[30] W. Lorensen, H. Cline. “Marching Cubes: A high resolution 3D surface

construction algorithm”, ACM Computer Graphics, Vol. 21, No. 4, pages: 163-

170, July 1987.

[31] C. Montani, R. Scateni, R. Scopigno. “Discretized Marching Cubes”,

Proceedings, Visualization 1994.

[32] B. Wunsche, J. Z. Lin. “An Efficient Topological Correct Polygonisation

Algorithm for Finite Element Data Sets”, Proc. Of IVCNZ, Nov. 2003.

[33] C-C. Ho, F-C. Wu, B-Y. Chen, Y-Y. Chuang, M. Ouhyoung. “Cubical Marching

Squares: Adaptive Feature Preserving Surface Extraction from Volume Data”,

EUROGRAPHICS, Vol. 24, Nov 2005.


[34] T. Lewiner, H. Lopes, A. W. Vieira and G. Tavares. “Efficient Implementation of

Marching Cubes’ cases with Topological Guarantees”, Journal of Graphics Tools,

Vol. 8, No. 2, pp. 1-15. 2003.

[35] E. V. Chernyaev. “Marching Cubes 33: Construction of topologically correct

Isosurfaces”, Technical Report CN/95-17, CERN, 1995

[36] Lee Westover. “Footprint evaluation for volume rendering”. In Proc. SIGGRAPH

90, pages 367–376, 1990.

[37] N. Amenta, M. Bern, M. Kamvysselis. “A New Voronoi-Based Surface Reconstruction Algorithm”, Computer Graphics Proceedings, SIGGRAPH 98.

[38] R. Müller, T. Hildebrand and P. Rüegsegger. “Non-invasive bone biopsy: a new

method to analyse and display the three-dimensional structure of trabecular bone”,

Phys. Med. Biol. Vol. 39. 1994.


Appendix 1: Scorpion PGR 1394 Camera (Model: SCOR-03NS)

Figure 1: Picture of Scorpion Camera Module

Camera Specifications:

Table 1: Camera Specifications


Sensor Specifications:

Table 2: Sensor Specification

General Purpose Input/Output (GPIO) Pins: The Scorpion has a set of 8 GPIO pins that can be accessed through the Hirose HR10 (12-pin) external interface. These IO pins can be configured to accept an input signal to externally trigger the camera, or to send an output signal or strobe to an external device.

To configure the GPIO pins, consult the PGR IEEE-1394 Digital Camera Register

Reference.

GPIO Connector Pin Layout: The following diagram shows the pin layout for the Hirose HR10 12-pin female circular connector (manufacturer part number: HR10A-10R-12SB) used on all Scorpion models. The male counterpart's manufacturer part number is HR10A-10P-12P.


Figure 2: GPIO Pin Layout

GPIO Electrical Characteristics: The Scorpion GPIO pins are TTL 3.3V pins protected by two diodes to +3.3V and GND in parallel. There is also a 10K resistor in series to limit current. When configured as inputs, the pins can be directly driven from a 3.3V or 5V logic output. For output, each GPIO pin has almost no drive strength (it is high impedance) and needs to be buffered with a transistor or driver to lower its impedance. The IO pins are protected from both over- and under-voltage.


Appendix 2:

• Rotary Positioning Table

Figure 3: RT-12 Rotary Positioning Table

The RT-12 rotary positioning table can be used to position a variety of payloads such as cameras or test fixtures. The 12” diameter aluminum top plate has 24 tapped holes to attach the application. The RT-12 has a home switch that provides feedback to the motion control system, indicating the exact position of the table.


• MDrive23 Stepper Motor

Mechanical Specifications:

Figure 4: Rotary MDrive23 Mechanical Specifications

Electrical Specifications:

Table 3: Electrical Specifications


Appendix 3: Camera Calibration

Generate the calibration pattern:

Generate a checkerboard pattern and paste it on a flat panel. Measure the X and Y dimensions of the squares. To make these measurements the default, change the dX_default and dY_default values in click_calib.m and click_calib_no_read.m (in the calib folder).
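As an aside, the checkerboard pattern itself is easy to generate programmatically. The sketch below (Python, purely illustrative; the `checkerboard` helper is not part of the calibration toolbox) builds a binary checkerboard image as nested lists of 0/255 pixel values:

```python
def checkerboard(rows, cols, square_px):
    """Build a binary checkerboard image (nested lists of 0/255 pixel
    values) with rows x cols squares, each square_px pixels on a side."""
    h, w = rows * square_px, cols * square_px
    # A pixel is white when the sum of its square indices is odd.
    return [[255 * (((y // square_px) + (x // square_px)) % 2)
             for x in range(w)] for y in range(h)]

# 8x10 squares of 60 px each gives a 480x600 image suitable for printing.
img = checkerboard(8, 10, 60)
```

Print the pattern at a known scale, then measure the actual square size on the printed panel rather than trusting the nominal pixel dimensions.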

Camera Calibration Steps:

Start the main Matlab calibration function by typing calib_gui. In the standard mode, all the calibration images are loaded into memory once and are not read from the disk again, which increases speed by reducing the number of disk accesses. But if the images are large, or there are many of them, Matlab may run out of memory. For such cases the calibration toolbox has a memory efficient mode, in which the images are loaded into memory one at a time. This mode takes more time than the standard mode since the disk is accessed multiple times. The two modes of operation are fully compatible and interchangeable.

The mode of operation can be specified at the Matlab command prompt as

calib_gui(0) for standard mode or calib_gui(1) for memory efficient mode.

Capture any number of images of the checkerboard pattern held in different positions and orientations and store them in a common folder. The image names must have a common base name followed by numbers in sequence. A few of the images are shown below. The image names are imgint1.jpg, imgint2.jpg, imgint3.jpg, imgint4.jpg and so on.
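The toolbox performs this base-name-plus-number matching itself once you enter the base name; as an illustration only, a helper that lists such files in numeric order might look like this (Python; `find_calib_images` is a hypothetical name, not a toolbox function):

```python
import re
from pathlib import Path

def find_calib_images(folder, basename, ext="jpg"):
    """Return image file names of the form <basename><number>.<ext>
    found in folder, sorted by their sequence number."""
    pattern = re.compile(re.escape(basename) + r"(\d+)\." + re.escape(ext))
    hits = []
    for p in Path(folder).iterdir():
        m = pattern.fullmatch(p.name)
        if m:
            # Sort numerically, so imgint10 comes after imgint2.
            hits.append((int(m.group(1)), p.name))
    return [name for _, name in sorted(hits)]
```

Note the numeric sort: plain alphabetical ordering would place imgint10.jpg before imgint2.jpg.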


Figure 5: Images of checker board pattern taken for camera calibration.

Corner extraction and calibration

Once the images are captured and stored, start the calibration function by typing calib_gui(1) at the Matlab command prompt. This opens a GUI as shown below.

Figure 6: Matlab GUI for calibration.

Click on ‘Read Images’ button and when prompted enter the base name of the

image and the image format, in this case it would be ‘imgint’ and ‘jpg’ respectively.

Basename camera calibration images (without number nor suffix): imgint

Image format: ([]='r'='ras', 'b'='bmp', 't'='tif', 'p'='pgm', 'j'='jpg', 'm'='ppm') >> j


Checking directory content for the calibration images (no global image loading in

memory efficient mode)

Found images:

1...4...5...6...7...8...10...11...12...13...14...16...17...18...20...22...24...25...26...28...29...30...

done

To display the thumbnail images of all the calibration images, you may run

mosaic_no_read (may be slow)

Click on the ‘Extract grid corners’ button. At the prompt, hit enter without any arguments to select all images. Choose the default window size for the corner finder, i.e. wintx = winty = 5, by pressing enter without any arguments.

Extraction of the grid corners on the images

Number(s) of image(s) to process ([] = all images) =

Window size for corner finder (wintx and winty):

wintx ([] = 5) =

winty ([] = 5) =

Window size = 11x11

Do you want to use the automatic square counting mechanism (0=[]=default)

or do you always want to enter the number of squares manually (1,other)?

The corner extraction algorithm includes an automatic mechanism for counting the number of squares in the grid. This is useful when working with a large number of images. In some cases the code may not predict the exact number of squares; this can happen when calibrating lenses with extreme distortions. When prompted for automatic square counting, press enter to choose the default option. At the end, images with problems can be reprocessed by counting the squares manually.

The images are then displayed on the screen for corner extraction. The first image

is shown below.

Processing image 1...

Loading image imgint1.jpg...


Using (wintx,winty)=(5,5) - Window size = 11x11 (Note: To reset the window size,

run script clearwin)

Click on the four extreme corners of the rectangular complete pattern (the first

clicked corner is the origin)...

Figure 7: First calibration image.

The first clicked point is associated with the origin of the reference frame attached to the grid. The other three points can be selected in any order. Selecting the first point consistently is especially important when calibrating multiple cameras in space: the same grid pattern reference frame needs to be selected for the different camera images, i.e. grid points need to correspond across all the different camera views.


Figure 8: Extraction of corners.

Once all four corners are extracted, the algorithm prompts for the size of the squares on the checkerboard pattern. The default value is 30 mm, and in this case the size is indeed 30 mm, so press enter without entering any arguments.

Size dX of each square along the X direction ([]=30mm) =

Size dY of each square along the Y direction ([]=30mm) =

The algorithm makes an initial guess of the corners and automatically counts the number of squares in both dimensions (shown in the figure below with red crosses).


Figure 9: Initial guess for corner extraction.

If the predicted corners are close to the actual corners, the next step (entering an initial guess for distortion) may be skipped, and the corners are extracted using those positions as the initial guess.

If the guessed grid corners (red crosses on the image) are not close to the actual

corners,

it is necessary to enter an initial guess for the radial distortion factor kc (useful for

subpixel detection)

Need of an initial guess for distortion? ([]=no, other=yes)

Corner extraction...

Done

The final extracted corners are shown below. The origin of the reference frame is

marked with ‘O’.


Figure 10: Extracted corners.

Sometimes the predicted corners are not close enough to the actual corners to allow for effective corner extraction. In such cases it is necessary to refine the predicted corners by entering a guess for the lens distortion coefficient. An example is shown below.


Figure 11: The predicted corners are not close to the real corners.

If the predicted corners are too far from the real grid corners, corner extraction might fail. The main cause of this is image distortion. To help the system make a better guess of the corner locations, the user can manually input a guess for the first-order lens distortion coefficient kc (the first entry of the full distortion coefficient vector kc). To do so, enter a non-empty string (for example, 1) at the question Need of an initial guess for distortion?, and then enter a distortion coefficient of kc = -0.3 (in practice, this number typically lies between -1 and 1).

If the guessed grid corners (red crosses on the image) are not close to the actual

corners,

it is necessary to enter an initial guess for the radial distortion factor kc (useful for

subpixel detection)

Need of an initial guess for distortion? ([]=no, other=yes) 1

Use number of iterations provided

Use focal provided

Estimated focal: 2727.4256 pixels

Guess for distortion factor kc ([]=0): -0.3

Satisfied with distortion? ([]=no, other=yes) 1
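To see why a negative kc pulls the predicted corners inward, consider the radial model the guess feeds into. Keeping only the first-order term (higher-order radial and tangential terms omitted for brevity; this is an illustrative sketch, not toolbox code), the mapping from undistorted to distorted normalized coordinates is:

```python
def distort(xn, yn, kc1):
    """Apply first-order radial distortion to normalized image
    coordinates (a sketch keeping only the first entry of the
    distortion coefficient vector kc)."""
    r2 = xn * xn + yn * yn
    factor = 1.0 + kc1 * r2
    return xn * factor, yn * factor

# With the guessed kc = -0.3, points away from the image center are
# pulled inward (barrel distortion), moving the predicted grid corners
# closer to their true image locations.
xd, yd = distort(0.5, 0.5, -0.3)
```

The effect grows quadratically with the distance from the image center, which is why the mismatch in Figure 11 is worst near the image borders.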


Corner extraction...

Figure 12: New predicted corners

Figure 13: Final extracted corners

After corner extraction, the Matlab data file calib_data.mat is automatically generated. This file contains all the information gathered throughout the corner extraction stage (image coordinates, corresponding 3D grid coordinates, grid sizes).


During calibration, when there is a large amount of distortion in the image, the program may not be able to automatically count the number of squares in the grid. In that case, the number of squares in both the X and Y directions has to be entered manually.

Main Calibration Step:

After the corner extraction, click on the Calibration button of the Camera Calibration Tool to run the main camera calibration procedure. Calibration is done in two steps: initialization and nonlinear optimization. The initialization step computes a closed-form solution for the calibration parameters, not including any lens distortion. The nonlinear optimization step minimizes the total reprojection error over all the calibration parameters. The optimization is done by iterative gradient descent with an explicit computation of the Jacobian matrix.
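The kind of update such an optimizer performs can be illustrated on a toy two-parameter least-squares problem. The sketch below (Python, illustrative only; the toolbox itself minimizes the reprojection error over all calibration parameters at once) performs one Gauss-Newton-style step built from an explicit Jacobian:

```python
def gauss_newton_step(residual, jacobian, p):
    """One Gauss-Newton update for a two-parameter least-squares
    problem: solve the 2x2 normal equations J^T J dp = -J^T r built
    from the explicit Jacobian."""
    r = residual(p)
    J = jacobian(p)
    # Accumulate J^T J and J^T r by hand for the 2-parameter case.
    a = sum(row[0] * row[0] for row in J)
    b = sum(row[0] * row[1] for row in J)
    c = sum(row[1] * row[1] for row in J)
    g0 = sum(row[0] * ri for row, ri in zip(J, r))
    g1 = sum(row[1] * ri for row, ri in zip(J, r))
    # Invert the 2x2 system and step downhill.
    det = a * c - b * b
    dp0 = -(c * g0 - b * g1) / det
    dp1 = -(-b * g0 + a * g1) / det
    return [p[0] + dp0, p[1] + dp1]

# Toy example: fit y = m*x + q to exact samples of y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2 * x + 1 for x in xs]
res = lambda p: [p[0] * x + p[1] - y for x, y in zip(xs, ys)]
jac = lambda p: [[x, 1.0] for x in xs]
p = gauss_newton_step(res, jac, [0.0, 0.0])
```

Because this toy residual is linear in the parameters, a single step reaches the minimum; the real reprojection error is nonlinear, which is why the toolbox iterates.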

Aspect ratio optimized (est_aspect_ratio = 1) -> both components of fc are estimated

(DEFAULT).

Principal point optimized (center_optim=1) - (DEFAULT). To reject principal

point, set center_optim=0

Skew not optimized (est_alpha=0) - (DEFAULT)

Distortion not fully estimated (defined by the variable est_dist):

Sixth order distortion not estimated (est_dist(5)=0) - (DEFAULT) .

Main calibration optimization procedure - Number of images: 22

Gradient descent iterations: 1…2…3…4…5…6…7…8…9…10…11...done

Estimation of uncertainties...done

Calibration results after optimization (with uncertainties):

Focal Length: fc = [ 1051.16340 1048.19912 ] ± [ 13.85102 13.49534 ]

Principal point: cc = [ 356.27700 237.85069 ] ± [ 20.24375 19.47471 ]

Skew: alpha_c = [0.00000] ± [ 0.00000 ] => angle of pixel axes = 90.00000 ±

0.00000 degrees

Distortion: kc = [ -0.30324 0.31284 -0.00228 0.00140 0.00000 ] ± [ 0.04881

0.33390 0.00318 0.00321 0.00000 ]

Pixel error: err = [ 0.27289 0.28010 ]

Note: The numerical errors are approximately three times the standard deviations

(for reference).
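The recovered intrinsics define a pinhole-plus-distortion projection. As a sketch of that mapping (Python, illustrative; tangential distortion terms are omitted for brevity and skew is dropped since alpha_c = 0 here), a 3D point in the camera frame maps to pixel coordinates as:

```python
def project(Xc, fc, cc, kc1=0.0, kc2=0.0):
    """Project a 3D point in the camera frame to pixel coordinates:
    normalize by depth, apply the first two radial distortion terms,
    then scale by the focal lengths and shift by the principal point."""
    X, Y, Z = Xc
    xn, yn = X / Z, Y / Z
    r2 = xn * xn + yn * yn
    d = 1.0 + kc1 * r2 + kc2 * r2 * r2
    u = fc[0] * xn * d + cc[0]
    v = fc[1] * yn * d + cc[1]
    return u, v

# Sanity check with the estimated parameters above: a point on the
# optical axis projects exactly onto the principal point cc.
u, v = project((0.0, 0.0, 1.0), fc=(1051.16340, 1048.19912),
               cc=(356.27700, 237.85069))
```

The reported pixel error (err ≈ 0.27 pixels in each direction) is the RMS difference between such projections and the extracted corner locations over all calibration images.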


Observe that only 11 gradient descent iterations are required to reach the minimum, i.e. only 11 evaluations of the reprojection function plus Jacobian computation and inversion. The reason for this fast convergence is the quality of the initial guess for the parameters computed by the initialization procedure.

The 3D position of the calibration grid with respect to the camera (Figure 14) and the position of the camera with respect to the grid reference frame (Figure 15) are computed from the obtained parameters.

Figure 14: Position of the grid with respect to the camera

Figure 15: Position of the camera with respect to the grid reference frame