
International Workshop on OpenCL

12-13 May 2015

A Compute Model for Augmented Reality with Integrated-GPU Acceleration

Preeti Bindu, Jeremy Bottleson, Sungye Kim, Jingyi Jin

Graphics Initiative Team, VPG, Intel Corporation

Motivation

AR Building Blocks
• Realistic virtual object rendering
• Tracking sensor movement
• High-quality data capture through sensors
• Seamless user experience

Components of a Typical AR App

Augmented Reality (AR) is a live view of real-world sequences enhanced with digital information. To enable such enhancement, the typical modules of an AR application can be very compute intensive, which prevents users from having a smooth experience on low-end devices.

We present an AR application whose performance is boosted by balancing the heavy workload across both the CPU and the GPU:
1) Capture high-quality data through the color camera and depth sensor, and process that heavy data stream;
2) Localize and track sensor movement, which aligns the real and virtual views when rendering the two worlds;
3) Recover camera pose tracking even if it is lost momentarily;
4) Render virtual objects with photorealism so they blend well into the real world;
5) Provide a seamless user experience.
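The poster itself does not include host code, but a minimal sketch of the setup it implies is shown here: selecting the integrated GPU through standard OpenCL 1.2 host calls so that heavy kernels run on the GPU while the CPU keeps the application logic. All identifiers are illustrative, not the authors' code.

/* Minimal host-side sketch (assumed setup, OpenCL 1.2): pick the
 * integrated GPU for the heavy AR kernels. Error handling abbreviated. */
#include <CL/cl.h>
#include <stdio.h>

int main(void) {
    cl_platform_id platform;
    cl_device_id gpu;
    cl_int err;

    if (clGetPlatformIDs(1, &platform, NULL) != CL_SUCCESS) return 1;

    /* On these Intel processors the GPU shares physical memory with the
     * CPU, so buffers created with CL_MEM_USE_HOST_PTR can be shared
     * zero-copy between the two devices. */
    if (clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &gpu, NULL) != CL_SUCCESS)
        return 1;

    cl_context ctx = clCreateContext(NULL, 1, &gpu, NULL, NULL, &err);
    cl_command_queue queue = clCreateCommandQueue(ctx, gpu, 0, &err);

    /* Heavy stages (depth processing, feature matching) are enqueued on
     * this queue; the CPU continues with UI and application logic. */
    printf("Integrated GPU context ready.\n");

    clReleaseCommandQueue(queue);
    clReleaseContext(ctx);
    return 0;
}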

Silicon die layout of a 4th Generation Intel® Core™ processor; over half of the die is dedicated to the integrated GPU.

Use Case

OpenCL Optimization

VSA Block Diagram

VSA Modules
1) Imaging Device: real-time capture of RGB and depth camera sequences
2) Depth Image Processing: fill "holes" in the raw depth map
3) Camera Tracking: compute the camera pose from the depth image
4) Render Engine/UI: use the RGB and depth images along with the camera pose to render the real world with virtual 3D objects; record/playback user interactions

Camera Tracking Algorithm

Pipeline stages (from the block diagram): Preprocess RGB → Feature Tracking (Feature Extraction → Feature Matching) → Pose Estimation → Rendering → Composition

• Feature Extraction: ~(300-500 features) x (64 dimensions/feature) per frame, stored as an array of descriptor vectors [0], [1], [2], ..., [n]
• Feature Matching: 100s to 1000s of objects x ~1000 vectors per object; several matching algorithms based on nearest-neighbor (NN) search, binning, or optimized data-structure searches (a brute-force NN sketch follows this list)
• Pose Estimation: rotation and translation matrices are estimated from the uv-correspondences
• Rendering: 3D graphics at (5-10 objects) x (1-10K polygons/object) x (10-100 pixels/polygon)
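The poster lists several matching options; the following is a minimal sketch of only the brute-force nearest-neighbor variant, assuming 64-dimensional float descriptors (SURF-style). The kernel name, buffer layout, and parameters are illustrative assumptions.

/* Illustrative OpenCL C sketch: brute-force NN matching. For each query
 * descriptor, find the database descriptor with the smallest squared L2
 * distance. Assumes row-major Q x 64 and D x 64 float buffers. */
#define DESC_DIM 64

__kernel void nn_match(__global const float *queries,   /* Q x 64 floats */
                       __global const float *database,  /* D x 64 floats */
                       __global int *best_index,        /* one per query */
                       const int num_queries,
                       const int num_db)
{
    const int q = get_global_id(0);
    if (q >= num_queries) return;

    float best = FLT_MAX;
    int best_i = -1;
    for (int d = 0; d < num_db; ++d) {
        float dist = 0.0f;
        for (int k = 0; k < DESC_DIM; ++k) {
            const float diff = queries[q * DESC_DIM + k]
                             - database[d * DESC_DIM + k];
            dist += diff * diff;
        }
        if (dist < best) { best = dist; best_i = d; }
    }
    best_index[q] = best_i;
}

Binning or tree-based searches trade this O(Q x D) cost for extra preprocessing, which is presumably why the poster lists them as alternatives.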

(Figure: input image frame after preprocessing; extracted features overlaid)

Object Recognition

(Figures: Depth Map, Ray-casted Depth Map, Occlusion, Shadow Rendering, Measurement Tool, Self-Shadow Rendering)

Visual Shopping Assistant (VSA) In Action
• User takes measurements in real time
• User inserts a virtual object to be purchased
• User manipulates the object in real time to translate, rotate, etc.
• An online purchase is made
• Real and virtual object comparison

Depth Map Processing
1) Fill "holes" in depth maps
2) Canny edge detection to find delimiters
3) Nearest-neighbor search or interpolation to fill in the missing pixels
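The poster does not show the hole-filling kernel; a minimal OpenCL C sketch of the nearest-neighbor variant is given below, replacing zero-valued depth pixels with the closest valid sample on the same scanline. The kernel name, the ushort depth format, and the RADIUS bound are assumptions.

/* Illustrative sketch: fill depth "holes" (zero pixels) with the nearest
 * valid neighbor along the row. Not the authors' actual kernel. */
__kernel void fill_depth_holes(__global const ushort *in,
                               __global ushort *out,
                               const int width,
                               const int height)
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    if (x >= width || y >= height) return;

    const int idx = y * width + x;
    const ushort d = in[idx];
    if (d != 0) { out[idx] = d; return; }

    /* Search outward left and right for the closest valid depth sample;
     * in the full pipeline the Canny-detected edges would delimit this
     * search so that fills do not cross object boundaries. */
    const int RADIUS = 16;
    for (int r = 1; r <= RADIUS; ++r) {
        if (x - r >= 0    && in[idx - r] != 0) { out[idx] = in[idx - r]; return; }
        if (x + r < width && in[idx + r] != 0) { out[idx] = in[idx + r]; return; }
    }
    out[idx] = 0; /* no valid neighbor within RADIUS */
}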

Tracking and Recovery
1) Recover the correct camera pose after tracking is lost, e.g. when the depth sensor shakes or loses the current depth frame
2) Store keyframes consisting of feature descriptors (extracted with BRISK or SURF run on an RGB image) and the corresponding camera pose
3) Recover the current camera pose by extracting features from the current frame and matching them to the previously generated keyframes
4) The Hamming distance computation for feature matching is implemented in OpenCL (see the sketch below)
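The poster states that this Hamming-distance computation is implemented in OpenCL but does not show the kernel; below is a hedged sketch assuming 512-bit binary descriptors (BRISK-style) packed as 16 uints each. Names and layout are illustrative.

/* Illustrative sketch: Hamming distance between one query descriptor and
 * every keyframe descriptor. popcount of the XOR counts differing bits. */
#define WORDS_PER_DESC 16   /* 16 x 32 bits = 512-bit descriptor */

__kernel void hamming_match(__global const uint *query,      /* 16 uints */
                            __global const uint *keyframes,  /* N x 16   */
                            __global uint *distances,        /* N out    */
                            const int num_descriptors)
{
    const int i = get_global_id(0);
    if (i >= num_descriptors) return;

    uint dist = 0;
    for (int w = 0; w < WORDS_PER_DESC; ++w)
        dist += popcount(query[w] ^ keyframes[i * WORDS_PER_DESC + w]);
    distances[i] = dist;
}

The host (or a follow-up reduction kernel) then takes the minimum over distances to select the best-matching keyframe.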

Conclusion & Future Work

Platforms Running VSA
• Windows
• Android

HW Tested
• Intel Haswell on Surface Pro 2
• Intel Atom SoC on Bay Trail

Camera Technology
• Creative Gesture
• PrimeSense
• Kinect

Future Work
• Incorporate OpenCL 2.0
• Implement more depth-processing and tracking algorithms

Camera Tracking Pipeline (block diagram):
Depth Map → Bilateral Filter → Filtered Depth Map → Depth Pyramid → Depth-to-Vertex / Vertex-to-Normal → Vertex/Normal Pyramid → Tracking (Pose Estimation, including computing the linear systems) → Pose → Update Model (Global TSDF Volume Data) → Raycast → raycast Vertex/Normal Pyramid and Raycast Depth Map Pyramid, fed back into Tracking for the next frame
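The first stage of this pipeline is the bilateral filter on the raw depth map; a minimal OpenCL C sketch is given below, assuming a float depth image. The sigma values, window size, and kernel name are illustrative assumptions, not the poster's code.

/* Illustrative sketch: edge-preserving bilateral filter on the depth map.
 * Weight = spatial Gaussian x range Gaussian, so depth discontinuities
 * are preserved while flat regions are smoothed. */
__kernel void bilateral_depth(__global const float *in,
                              __global float *out,
                              const int width,
                              const int height)
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    if (x >= width || y >= height) return;

    const float sigma_s = 3.0f;   /* spatial sigma, in pixels        */
    const float sigma_r = 30.0f;  /* range sigma, in depth units     */
    const int   R       = 4;      /* half-width of the filter window */

    const float center = in[y * width + x];
    float sum = 0.0f, wsum = 0.0f;

    for (int dy = -R; dy <= R; ++dy) {
        for (int dx = -R; dx <= R; ++dx) {
            const int xx = clamp(x + dx, 0, width - 1);
            const int yy = clamp(y + dy, 0, height - 1);
            const float v  = in[yy * width + xx];
            const float ws = exp(-(float)(dx * dx + dy * dy)
                                 / (2.0f * sigma_s * sigma_s));
            const float wr = exp(-((v - center) * (v - center))
                                 / (2.0f * sigma_r * sigma_r));
            sum  += ws * wr * v;
            wsum += ws * wr;
        }
    }
    out[y * width + x] = sum / wsum;
}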