Augmented Reality Using Pose Estimation on the Android Mobile Platform
Joshua French
What is Augmented Reality (AR)?
A live view of a real, physical environment whose elements are augmented by computer-generated sensory input
Sensory input could be graphics, video, text, audio…
Combining it with sensors and ideas from computer vision allows the augmentation to become interactive
Devices include head-mounted displays, projectors, glasses, televisions, hand-held mobile devices
Android
Open-source, Linux-based OS for mobile devices such as tablets and cell phones
Most devices have many sensors such as cameras, touch screens, accelerometers, GPS, and others for adding interactivity to an AR application
Project – Rendering a Virtual Object in a Scene
Pose Estimation - Two methods:
Positioning sensors (GPS, accelerometers, magnetic/gravitational sensors, …)
2D-3D correspondences
My Project:
Camera-based approach: estimate pose from 2D-3D correspondences
Pose Estimation Using Correspondences
Two main steps in the algorithm:
Determine 2D-3D correspondences
Calculate pose from correspondences
2D-3D Correspondences
Fiducials – Objects placed in a field of view used as a point of reference:
Markers
CCCs (concentric contrasting circles)
QR codes
AprilTags
ARToolkit markers
Environment features
Example of Detecting a Fiducial Marker: AprilTag
Procedure:
1) Detect line segments
2) Find 4 corners of a quad formed by four sequential line segments
3) Compute tag pose using four-point homography estimation
4) Decode tag
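Step 3 relies on a homography estimated from the four corner correspondences. The sketch below shows one common way to do this: fix the last entry of the 3x3 matrix to 1 and solve the resulting 8x8 linear system. The function names and the corner coordinates are assumptions made for illustration, and the AprilTag implementation itself may compute the homography differently.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_four_points(src, dst):
    """Return a 3x3 matrix H (with H[2][2] = 1) mapping src (x, y) to dst (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through H, dividing by the homogeneous coordinate."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return (u, v)


# Hypothetical data: tag corners in tag coordinates and detected image corners.
tag_corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
image_corners = [(100.0, 100.0), (200.0, 110.0), (190.0, 210.0), (95.0, 200.0)]
H = homography_from_four_points(tag_corners, image_corners)
```

With four points and eight unknowns the system is exactly determined, so the resulting matrix maps each tag corner onto its detected image corner.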
1 – The gradient magnitude and direction are computed at each pixel. Pixels with similar gradients are clustered into components, and a line segment is then fit to each component using least squares.
2 – A depth-first search (DFS) of depth four is used to find quads whose segments obey a counterclockwise (CCW) winding order.
4 – Using the calculated homography, the tag-relative coordinates of each bit field are transformed into image coordinates, and the resulting pixels are thresholded to determine whether each bit is black or white.
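The decoding step can be sketched as follows. This is a simplified illustration, not the AprilTag decoder: it assumes tag coordinates in the unit square, a single sample at the center of each bit cell, a fixed global threshold, and white read as 1. The function names, the homography, and the tiny synthetic image are all made up for the example.

```python
def project(h, x, y):
    """Map tag-relative (x, y) to image (u, v) through the 3x3 homography h."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v


def decode_bits(image, h, grid_size, threshold):
    """Sample the center of each bit cell and threshold it (1 = white, 0 = black).

    image is a list of rows of grayscale intensities, indexed image[row][col].
    """
    bits = []
    for r in range(grid_size):
        row_bits = []
        for c in range(grid_size):
            # Center of cell (r, c) in tag-relative coordinates in [0, 1].
            tx = (c + 0.5) / grid_size
            ty = (r + 0.5) / grid_size
            u, v = project(h, tx, ty)
            col, row = int(round(u)), int(round(v))
            row_bits.append(1 if image[row][col] > threshold else 0)
        bits.append(row_bits)
    return bits


# Hypothetical 8x8 image holding a 2x2-bit tag that covers pixels 2..5.
image = [[0] * 8 for _ in range(8)]
for row, col in [(2, 2), (2, 3), (3, 2), (3, 3), (4, 4), (4, 5), (5, 4), (5, 5)]:
    image[row][col] = 255  # white cells at grid positions (0, 0) and (1, 1)

# Homography that scales the unit square by 4 and shifts it by 2 pixels.
H = [[4.0, 0.0, 2.0], [0.0, 4.0, 2.0], [0.0, 0.0, 1.0]]
bits = decode_bits(image, H, grid_size=2, threshold=128)
```

A fixed threshold is the weakest part of this sketch: it is sensitive to environmental illumination, so a practical decoder would more likely derive the threshold from the local image content.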
Factors affecting accuracy
Environmental illumination
Tag non-planarity (bending)
Lens distortion
Angle of tag
Distance to tag
Calculating Pose
Least Squares
Direct Linear Transform (DLT)
PosIt
Nestor
SLAM/CLAM
Least Squares
Find x (pose) to minimize the error $E = \lVert f(x) - y_0 \rVert^2$, where $y_0$ holds the observed image points
Algorithm:
1) Guess pose ($x = x_0$)
2) Compute predicted image points ($y = f(x)$); the residual error is $dy = y_0 - y$
3) Calculate the Jacobian ($J = \partial f / \partial x$)
4) With $dy = J\,dx$, solve for $dx$ using the pseudo-inverse ($dx = (J^T J)^{-1} J^T\,dy$)
5) Set $x \leftarrow x + dx$
6) Repeat steps 2-5 until convergence
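The iteration can be sketched on a deliberately simplified problem. Instead of a full six-degree-of-freedom camera pose, this example assumes a planar pose (tx, ty, theta) applied to 2D model points, a forward-difference Jacobian, and a residual defined as observed minus predicted so that the update is x + dx. All function names are hypothetical.

```python
import math


def predict(pose, model):
    """f(x): map model points through the planar pose (tx, ty, theta)."""
    tx, ty, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    y = []
    for px, py in model:
        y += [c * px - s * py + tx, s * px + c * py + ty]
    return y


def numeric_jacobian(pose, model, step=1e-6):
    """J[i][k] = d f_i / d x_k, approximated by forward differences."""
    y = predict(pose, model)
    columns = []
    for k in range(len(pose)):
        shifted = list(pose)
        shifted[k] += step
        y_shifted = predict(shifted, model)
        columns.append([(a - b) / step for a, b in zip(y_shifted, y)])
    return [[col[i] for col in columns] for i in range(len(y))]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gauss_newton(pose0, model, observed, max_iters=50, tol=1e-9):
    """Refine the pose until the largest update component drops below tol."""
    x = list(pose0)
    n = len(x)
    for _ in range(max_iters):
        y = predict(x, model)
        dy = [obs - pred for obs, pred in zip(observed, y)]  # residual y0 - y
        jac = numeric_jacobian(x, model)
        # Normal equations (J^T J) dx = J^T dy, equivalent to the pseudo-inverse.
        jtj = [[sum(row[a] * row[b] for row in jac) for b in range(n)]
               for a in range(n)]
        jtdy = [sum(row[a] * r for row, r in zip(jac, dy)) for a in range(n)]
        dx = solve(jtj, jtdy)
        x = [xi + di for xi, di in zip(x, dx)]
        if max(abs(d) for d in dx) < tol:
            break
    return x


# Hypothetical data: a unit square observed under a known pose.
model = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
observed = predict((0.5, -0.2, 0.3), model)
estimate = gauss_newton([0.0, 0.0, 0.0], model, observed)
```

Solving the normal equations directly avoids forming an explicit inverse. For a real camera pose, `predict` would be replaced by a perspective projection of 3D points, and the quality of the initial guess matters more.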
Direct Linear Transform
PosIt
Two part algorithm:
POS (Pose from Orthography and Scaling)
Approximates perspective projection with a scaled orthographic projection and finds the rotation matrix and translation vector
IT (with ITerations)
Iteratively applies a scale factor to each image point to refine the scaled orthographic approximation until the change falls below a threshold
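The role of the per-point scale factor can be illustrated numerically. This is not a full PosIt implementation: it assumes the camera-frame coordinates of each point are already known, which the real algorithm has to estimate from the current pose at every iteration. The function names, the focal length, and the points are hypothetical.

```python
def perspective(point, focal):
    """True perspective projection of a camera-frame point (X, Y, Z)."""
    x, y, z = point
    return (focal * x / z, focal * y / z)


def scaled_orthographic(point, focal, z0):
    """Scaled orthographic projection: every point uses the reference depth z0."""
    x, y, _ = point
    return (focal * x / z0, focal * y / z0)


def scale_factor(point, z0):
    """Per-point correction eps, defined by Z = z0 * (1 + eps)."""
    return point[2] / z0 - 1.0


def corrected_image_point(point, focal, z0):
    """Perspective image point multiplied by (1 + eps)."""
    u, v = perspective(point, focal)
    eps = scale_factor(point, z0)
    return (u * (1.0 + eps), v * (1.0 + eps))


# Hypothetical camera-frame points; the first one is the reference point.
focal = 800.0
points = [(0.1, -0.2, 5.0), (0.3, 0.1, 5.5), (-0.2, 0.2, 4.8)]
z0 = points[0][2]
for p in points:
    print(corrected_image_point(p, focal, z0), scaled_orthographic(p, focal, z0))
```

Each corrected image point coincides with the scaled orthographic projection of the same point. This is why the iteration can start with all scale factors at zero, run the POS step, and then update the factors from the pose it just found.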
POS
PosIt
AndAR: https://code.google.com/p/andar/