
Augmented Reality Using Pose Estimation on the Android Mobile Platform

Joshua French

What is Augmented Reality (AR)?

A live view of a real, physical environment whose elements are augmented by computer-generated sensory input

Sensory input could be graphics, video, text, audio…

Combining it with sensors and ideas from computer vision allows the augmentation to become interactive

Devices include head-mounted displays, projectors, glasses, televisions, hand-held mobile devices

Android

Open-source, Linux-based OS for mobile devices such as tablets and cell phones

Most devices have many sensors such as cameras, touch screens, accelerometers, GPS, and others for adding interactivity to an AR application

Project – Rendering a Virtual Object in a Scene

Pose Estimation - Two methods:

Positioning sensors (GPS, accelerometers, magnetic/gravitational sensors, …)

2D-3D correspondences

My Project:

Camera-based approach: from 2D-3D correspondences to pose

Pose Estimation Using Correspondences

Two main steps to the algorithm:

Determine 2D-3D correspondences

Calculate pose from correspondences

2D-3D Correspondences

Fiducials – Objects placed in a field of view used as a point of reference:

Markers

CCCs

QR codes

AprilTags

ARToolkit markers

Environment features

Example of Detecting a Fiducial Marker: AprilTag

Procedure:

1) Detect line segments

2) Find 4 corners of a quad formed by four sequential line segments

3) Compute tag pose using four-point homography estimation

4) Decode tag
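Step 3 relies on a homography estimated from the four detected corners. A minimal sketch of that estimation, using the standard direct linear transform with NumPy. The function names `homography_from_points` and `apply_homography`, and the choice of the unit square as the tag frame, are illustrative assumptions and not the AprilTag library's API:

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography H mapping src -> dst from point pairs.

    src, dst: sequences of (x, y) pairs; four pairs give an exact solution.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the
        # nine entries of H (written as a vector h, with A h = 0).
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale (assumes H[2, 2] != 0)

def apply_homography(H, pts):
    """Map an (N, 2) array of points through H, returning an (N, 2) array."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Here `src` would be the tag's corners in its own frame, for example (0,0), (1,0), (1,1), (0,1), and `dst` the corners of the detected quad in the image.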

1 – Computes gradient magnitude and direction at each pixel. Pixels with similar gradients are clustered into components. Line segments are then fit to the components using least squares.

2 – A depth-first search (DFS) of depth four is used to find quads that obey counter-clockwise (CCW) winding order

4 – Using the calculated homography, tag-relative coordinates of each bit field are transformed into image coordinates and the resulting pixels are thresholded to determine if the bit is black or white
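The decoding step can be sketched as follows, assuming a homography `H` from the tag frame (taken here as the unit square) to image pixels is already available. The function name `read_tag_bits`, the 6x6 grid size, and the fixed threshold of 128 are assumptions made for illustration; a real detector would choose its threshold from the image rather than hard-code it:

```python
import numpy as np

def read_tag_bits(image, H, grid=6, threshold=128):
    """Sample the centre of each bit cell and threshold it.

    image: 2D grayscale array; H: 3x3 homography from tag frame to pixels.
    Returns a (grid, grid) boolean array, True where the cell reads white.
    """
    # Cell centres in tag-relative coordinates (tag = unit square).
    centres = (np.arange(grid) + 0.5) / grid
    u, v = np.meshgrid(centres, centres)  # u varies by column, v by row
    tag_pts = np.stack([u.ravel(), v.ravel(), np.ones(grid * grid)])

    # Transform tag-relative coordinates into image coordinates.
    img_pts = H @ tag_pts
    xs = np.rint(img_pts[0] / img_pts[2]).astype(int)
    ys = np.rint(img_pts[1] / img_pts[2]).astype(int)

    # Keep samples inside the image, then threshold the sampled pixels.
    xs = np.clip(xs, 0, image.shape[1] - 1)
    ys = np.clip(ys, 0, image.shape[0] - 1)
    samples = image[ys, xs]
    return (samples > threshold).reshape(grid, grid)
```

Sampling only the cell centre keeps the sketch short; averaging several pixels per cell would be more tolerant of blur and noise.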

Factors affecting accuracy

Environmental illumination

Tag non-planarity (bending)

Lens distortion

Angle of tag

Distance to tag

Calculating Pose

Least Squares

Direct Linear Transform (DLT)

PosIt

Nestor

SLAM/CLAM

Least Squares

Find x (pose) to minimize error ($E = \lVert f(x) - y_0 \rVert^2$)

Algorithm:

1) Guess pose (x = x0)

2) Compute predicted image points ($y = f(x)$); the residual error is $dy = y_0 - y$

3) Calculate Jacobian (J = [df / dx])

4) With $dy = J\,dx$, solve for $dx$ using the pseudoinverse: $dx = (J^T J)^{-1} J^T\,dy$

5) Set $x \leftarrow x + dx$

6) Repeat steps 2-5 until convergence
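The iteration above can be sketched in NumPy. Several choices here are assumptions for illustration and not taken from the project: the pose x is a 6-vector (a rotation vector followed by a translation), f is a pinhole projection with a single focal length, the Jacobian is approximated by finite differences, and the residual is taken as observed minus predicted so that adding dx reduces the error. All function names are hypothetical.

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix for a rotation vector w (axis * angle)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def project(x, pts3d, focal):
    """f(x): predicted image points, flattened, for pose x = (rotvec, t)."""
    cam = pts3d @ rodrigues(x[:3]).T + x[3:]
    return (focal * cam[:, :2] / cam[:, 2:3]).ravel()

def numeric_jacobian(x, pts3d, focal, step=1e-6):
    """Finite-difference approximation of J = df/dx."""
    y = project(x, pts3d, focal)
    J = np.zeros((y.size, x.size))
    for i in range(x.size):
        xp = x.copy()
        xp[i] += step
        J[:, i] = (project(xp, pts3d, focal) - y) / step
    return J

def estimate_pose(pts3d, y0, focal, x0, iters=50, tol=1e-10):
    """Refine the pose guess x0 so that f(x) matches the observations y0."""
    x = np.asarray(x0, dtype=float).copy()      # 1) guess pose
    for _ in range(iters):
        dy = y0 - project(x, pts3d, focal)      # 2) residual
        J = numeric_jacobian(x, pts3d, focal)   # 3) Jacobian
        dx = np.linalg.pinv(J) @ dy             # 4) pseudoinverse solve
        x = x + dx                              # 5) update
        if np.linalg.norm(dx) < tol:            # 6) stop on convergence
            break
    return x
```

`np.linalg.pinv(J)` equals $(J^T J)^{-1} J^T$ when J has full column rank and is better behaved numerically when it does not. As with any local method, the result depends on a reasonable initial guess.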

Direct Linear Transform
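As a rough sketch of the DLT idea applied to pose: each 2D-3D correspondence gives two linear equations in the twelve entries of the 3x4 projection matrix P, so six or more points in general position determine P up to scale. The version below assumes a known intrinsic matrix `K`, noise-free points, and no coordinate normalization; the function names are hypothetical.

```python
import numpy as np

def dlt_projection_matrix(pts3d, pts2d):
    """Estimate the 3x4 projection matrix P (up to scale) from >= 6 pairs."""
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        p = [X, Y, Z, 1.0]
        # u = (P1 . p) / (P3 . p) and v = (P2 . p) / (P3 . p), rearranged
        # into two equations that are linear in the entries of P.
        rows.append(p + [0.0] * 4 + [-u * c for c in p])
        rows.append([0.0] * 4 + p + [-v * c for c in p])
    A = np.array(rows, dtype=float)
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)  # null vector of A

def pose_from_projection(P, K):
    """Recover (R, t) from P = s * K [R | t], given known intrinsics K."""
    M = np.linalg.inv(K) @ P
    M = M / np.linalg.norm(M[2, :3])   # rows of a rotation have unit norm
    if np.linalg.det(M[:, :3]) < 0:    # resolve the sign ambiguity
        M = -M
    return M[:, :3], M[:, 3]

def project_points(P, pts3d):
    """Project (N, 3) points through P, returning (N, 2) image points."""
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    img = homog @ P.T
    return img[:, :2] / img[:, 2:3]
```

With noisy measurements the recovered 3x3 block is generally not an exact rotation, so it would need to be projected onto the nearest rotation matrix or refined with an iterative method.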

PosIt

Two part algorithm:

POS (Pose from Orthography and Scaling)

Approximates perspective projection with a scaled orthographic projection and finds the rotation matrix and translation vector

IT (with ITerations)

Iteratively applies a scale factor to each image point to refine the scaled orthographic approximation until the change falls below a threshold
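A compact sketch of the two parts working together, following the commonly published form of the algorithm (usually written POSIT). It assumes at least four non-coplanar object points, image coordinates measured relative to the principal point, and a known focal length in pixels; the function name `posit` and its parameters are illustrative.

```python
import numpy as np

def posit(object_pts, image_pts, focal, iters=100, tol=1e-10):
    """Estimate pose from >= 4 non-coplanar 3D points and their images.

    Returns (R, t), where t is the camera-frame position of object_pts[0],
    so a point M maps to R @ (M - object_pts[0]) + t.
    """
    object_pts = np.asarray(object_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    vectors = object_pts[1:] - object_pts[0]   # from reference point to the others
    B = np.linalg.pinv(vectors)                # precomputed once per object
    eps = np.zeros(len(vectors))               # per-point scale corrections

    for _ in range(iters):
        # POS: solve for the pose under scaled orthographic projection.
        xp = image_pts[1:, 0] * (1 + eps) - image_pts[0, 0]
        yp = image_pts[1:, 1] * (1 + eps) - image_pts[0, 1]
        I, J = B @ xp, B @ yp
        s = 0.5 * (np.linalg.norm(I) + np.linalg.norm(J))  # scale = focal / Z0
        i = I / np.linalg.norm(I)
        j = J / np.linalg.norm(J)
        k = np.cross(i, j)
        k /= np.linalg.norm(k)

        # IT: update the scale factor of each point from the new pose.
        new_eps = (vectors @ k) * s / focal
        converged = np.max(np.abs(new_eps - eps)) < tol
        eps = new_eps
        if converged:
            break

    R = np.vstack([i, j, k])
    t = np.array([image_pts[0, 0], image_pts[0, 1], focal]) / s
    return R, t
```

The first pass, with all corrections at zero, is the pure POS estimate; each later pass reuses the same pseudoinverse, which keeps the iterations cheap. The approximation works best when the object is small relative to its distance from the camera.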

POS

PosIt

AndAR: https://code.google.com/p/andar/