
426 Lecture 9: Research Directions in AR



The final lecture in the COSC 426 graduate course in Augmented Reality. Taught by Mark Billinghurst from the HIT Lab NZ at the University of Canterbury on Sept. 19th 2012


Page 1: 426 Lecture 9: Research Directions in AR

COSC 426: Augmented Reality

Mark Billinghurst

[email protected]

Sept 19th 2012

Lecture 9: AR Research Directions

Page 2: 426 Lecture 9: Research Directions in AR

Looking to the Future

Page 3: 426 Lecture 9: Research Directions in AR

The Future is with us

It takes at least 20 years for new technologies to go from the lab to the lounge.

“The technologies that will significantly affect our lives over the next 10 years have been around for a decade. The future is with us. The trick is learning how to spot it. The commercialization of research, in other words, is far more about prospecting than alchemy.”

Bill Buxton

Oct 11th 2004

Page 4: 426 Lecture 9: Research Directions in AR

Research Directions (layer diagram, Sony CSL © 2004)

- experiences — Usability
- applications — Interaction
- tools — Authoring
- components — Tracking, Display

Page 5: 426 Lecture 9: Research Directions in AR

Research Directions

- Components: markerless tracking, hybrid tracking; displays, input devices
- Tools: authoring tools, user-generated content
- Applications: interaction techniques/metaphors
- Experiences: user evaluation, novel AR/MR experiences

Page 6: 426 Lecture 9: Research Directions in AR

HMD Design

Page 7: 426 Lecture 9: Research Directions in AR

Occlusion with See-through HMD

- The problem: occluding real objects with virtual content, and occluding virtual objects with real content
- (Figure: real scene vs. view through a current see-through HMD)

Page 8: 426 Lecture 9: Research Directions in AR

ELMO (Kiyokawa 2001)

- Occlusion-capable see-through HMD
- Masking LCD
- Real-time range finding

Page 9: 426 Lecture 9: Research Directions in AR

ELMO Demo

Page 10: 426 Lecture 9: Research Directions in AR

ELMO Design

- Use an LCD mask to block the real world
- Depth sensing for occluding virtual images

(Diagram: virtual images from the LCD are combined with the real world through an optical combiner; an LCD mask and depth sensing control which parts of the real world are blocked.)

Page 11: 426 Lecture 9: Research Directions in AR

ELMO Results

Page 12: 426 Lecture 9: Research Directions in AR

Future Displays

  Always on, unobtrusive

Page 13: 426 Lecture 9: Research Directions in AR

Google Glasses

Page 14: 426 Lecture 9: Research Directions in AR

Contact Lens Display

- Babak Parviz, University of Washington
- MEMS components: transparent elements, micro-sensors
- Challenges: miniaturization, assembly, eye safety

Page 15: 426 Lecture 9: Research Directions in AR

Contact Lens Prototype

Page 16: 426 Lecture 9: Research Directions in AR

Applications

Page 17: 426 Lecture 9: Research Directions in AR

Interaction Techniques

- Input techniques: 3D vs. 2D input; pen/buttons/gestures
- Natural interaction: speech + gesture input
- Intelligent interfaces: artificial agents, context sensing

Page 18: 426 Lecture 9: Research Directions in AR

Flexible Displays

- Flexible lens surface
- Bimanual interaction
- Digital paper analogy

(Image: Red Planet, 2000)

Page 19: 426 Lecture 9: Research Directions in AR

Sony CSL © 2004

Page 20: 426 Lecture 9: Research Directions in AR

Sony CSL © 2004

Page 21: 426 Lecture 9: Research Directions in AR

Tangible User Interfaces (TUIs)

- GUMMI bendable display prototype
- Reproduced by permission of Sony CSL

Page 22: 426 Lecture 9: Research Directions in AR

Sony CSL © 2004

Page 23: 426 Lecture 9: Research Directions in AR

Sony CSL © 2004

Page 24: 426 Lecture 9: Research Directions in AR

Lucid Touch

- Microsoft Research & Mitsubishi Electric Research Labs
- Wigdor, D., Forlines, C., Baudisch, P., Barnwell, J., Shen, C. LucidTouch: A See-Through Mobile Device. In Proceedings of UIST 2007, Newport, Rhode Island, October 7-10, 2007, pp. 269–278.

Page 25: 426 Lecture 9: Research Directions in AR
Page 26: 426 Lecture 9: Research Directions in AR

Auditory Modalities

- Auditory: auditory icons, earcons, speech synthesis/recognition
- Nomadic Radio (Sawhney): combines spatialized audio, auditory cues, and speech synthesis/recognition

Page 27: 426 Lecture 9: Research Directions in AR

Gestural Interfaces

1. Micro-gestures (unistroke, SmartPad)
2. Device-based gestures (tilt-based examples)
3. Embodied interaction (EyeToy)

Page 28: 426 Lecture 9: Research Directions in AR

Natural Gesture Interaction on Mobile

- Use the mobile camera for hand tracking
- Fingertip detection

Page 29: 426 Lecture 9: Research Directions in AR

Evaluation

- Gesture input was more than twice as slow as touch
- No difference in naturalness

Page 30: 426 Lecture 9: Research Directions in AR

Haptic Modalities

- Haptic interfaces
- Simple uses in mobiles (e.g. vibration instead of a ringtone)
- Sony’s TouchEngine: physiological experiments show people can perceive two stimuli 5 ms apart, and displacements as small as 0.2 microns

(Diagram: TouchEngine multi-layer actuator; labels show n layers, 28 µm and 4 µm.)

Page 31: 426 Lecture 9: Research Directions in AR

Haptic Input

- AR Haptic Workbench: CSIRO 2003 – Adcock et al.

Page 32: 426 Lecture 9: Research Directions in AR

AR Haptic Interface

  Phantom, ARToolKit, Magellan

Page 33: 426 Lecture 9: Research Directions in AR

Natural Interaction

Page 34: 426 Lecture 9: Research Directions in AR

The Vision of AR

Page 35: 426 Lecture 9: Research Directions in AR

To Make the Vision Real

- Hardware/software requirements:
  - Contact lens displays
  - Free-space hand/body tracking
  - Environment recognition
  - Speech/gesture recognition
  - Etc.

Page 36: 426 Lecture 9: Research Directions in AR

Natural Interaction

- Automatically detecting the real environment: environmental awareness, physically based interaction
- Gesture input: free-hand interaction
- Multimodal input: speech and gesture interaction; implicit rather than explicit interaction

Page 37: 426 Lecture 9: Research Directions in AR

Environmental Awareness

Page 38: 426 Lecture 9: Research Directions in AR

AR MicroMachines

- AR experience with environment awareness and physically based interaction
- Based on the Microsoft Kinect RGB-D sensor
- The augmented environment supports occlusion, shadows, and physically based interaction between real and virtual objects

Page 39: 426 Lecture 9: Research Directions in AR

Operating Environment

Page 40: 426 Lecture 9: Research Directions in AR

Architecture

Our framework uses five libraries:
- OpenNI
- OpenCV
- OPIRA
- Bullet Physics
- OpenSceneGraph

Page 41: 426 Lecture 9: Research Directions in AR

System Flow

The system flow consists of three sections:
1. Image processing and marker tracking
2. Physics simulation
3. Rendering

Page 42: 426 Lecture 9: Research Directions in AR

Physics Simulation

- Create a virtual mesh over the real world
- Updated at 10 fps, so real objects can be moved
- Used by the physics engine for collision detection between virtual and real objects
- Used by OpenSceneGraph for occlusion and shadows

A minimal sketch of this update loop follows below.
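The original framework is built on Bullet Physics and OpenSceneGraph; the following is only an illustrative pybullet sketch (my assumption, not the authors' code) of the core idea: downsample each Kinect depth frame into a coarse heightfield collision shape at about 10 fps, so virtual objects collide with the captured real surface.

```python
# Illustrative sketch (not the original C++/Bullet code): rebuild a coarse
# heightfield from each depth frame so virtual objects collide with the
# real tabletop scene. Depth frames are assumed to be numpy arrays in metres.
import numpy as np
import pybullet as p

p.connect(p.DIRECT)          # headless physics; rendering is done elsewhere (OSG in the real system)
p.setGravity(0, 0, -9.8)

ROWS, COLS = 64, 64          # a coarse grid is enough at ~10 fps
terrain_body = None

def update_terrain(depth_m):
    """Downsample the depth image to a ROWS x COLS height grid and rebuild the static terrain body."""
    global terrain_body
    h, w = depth_m.shape
    ys = np.linspace(0, h - 1, ROWS).astype(int)
    xs = np.linspace(0, w - 1, COLS).astype(int)
    heights = depth_m.max() - depth_m[np.ix_(ys, xs)]      # convert distance-from-camera into height
    shape = p.createCollisionShape(
        p.GEOM_HEIGHTFIELD,
        meshScale=[0.02, 0.02, 1.0],
        heightfieldData=heights.flatten().tolist(),
        numHeightfieldRows=ROWS,
        numHeightfieldColumns=COLS)
    if terrain_body is not None:
        p.removeBody(terrain_body)
    terrain_body = p.createMultiBody(baseMass=0, baseCollisionShapeIndex=shape)

# A dynamic virtual object (stand-in for a toy car) that collides with the captured surface.
car = p.createMultiBody(baseMass=1.0,
                        baseCollisionShapeIndex=p.createCollisionShape(p.GEOM_SPHERE, radius=0.03),
                        basePosition=[0, 0, 0.5])

def on_new_depth_frame(depth_m):
    update_terrain(depth_m)              # ~10 fps: moved real objects change the collision surface
    for _ in range(24):                  # advance physics between depth updates
        p.stepSimulation()
    return p.getBasePositionAndOrientation(car)   # pose handed to the renderer for occlusion/shadows
```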

Page 43: 426 Lecture 9: Research Directions in AR

Rendering

(Figure panels: occlusion and shadows)

Page 44: 426 Lecture 9: Research Directions in AR

Natural Gesture Interaction

Page 45: 426 Lecture 9: Research Directions in AR

Motivation: AR MicroMachines and PhobiAR

- Treated the environment as static – no tracking
- Tracked objects only in 2D

More realistic interaction requires 3D gesture tracking.

Page 46: 426 Lecture 9: Research Directions in AR

Motivation: Occlusion Issues

AR MicroMachines only achieved realistic occlusion because the user’s viewpoint matched the Kinect’s.

Proper occlusion requires a more complete model of scene objects.

Page 47: 426 Lecture 9: Research Directions in AR

HITLabNZ’s Gesture Library – Architecture

5. Gesture: static gestures, dynamic gestures, context-based gestures
4. Modeling: hand recognition/modeling, rigid-body modeling
3. Classification/Tracking
2. Segmentation
1. Hardware Interface

Page 48: 426 Lecture 9: Research Directions in AR

Architecture – 1. Hardware Interface (library layers as above)

- Supports PCL, OpenNI, OpenCV, and the Kinect SDK.
- Provides access to depth, RGB, and XYZRGB data.
- Usage: capturing colour images, depth images and concatenated point clouds from a single camera or multiple cameras.
- Example devices: Kinect for Xbox 360, Kinect for Windows, Asus Xtion Pro Live.

An illustrative capture sketch follows below.
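The gesture library itself is not shown in the slides; purely as an illustration, here is a small Python sketch using OpenCV's OpenNI2 backend (one of the interfaces listed above) to grab registered colour and XYZ data and concatenate them into an XYZRGB point array.

```python
# Illustrative capture sketch using OpenCV's OpenNI2 backend (not the HIT Lab NZ API).
# Works with OpenNI-compatible RGB-D cameras such as the Kinect or Asus Xtion.
import cv2
import numpy as np

cap = cv2.VideoCapture(cv2.CAP_OPENNI2)          # open the first OpenNI2 device

def grab_xyzrgb():
    """Return an (H*W, 6) array of XYZRGB points for one frame, or None on failure."""
    if not cap.grab():
        return None
    ok_d, xyz = cap.retrieve(None, cv2.CAP_OPENNI_POINT_CLOUD_MAP)  # HxWx3, metres
    ok_c, bgr = cap.retrieve(None, cv2.CAP_OPENNI_BGR_IMAGE)        # HxWx3, uint8
    if not (ok_d and ok_c):
        return None
    rgb = bgr[..., ::-1].astype(np.float32) / 255.0                 # BGR -> RGB, normalised
    return np.concatenate([xyz.reshape(-1, 3), rgb.reshape(-1, 3)], axis=1)

points = grab_xyzrgb()
if points is not None:
    print("captured", points.shape[0], "XYZRGB points")
```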

Page 49: 426 Lecture 9: Research Directions in AR

Architecture – 2. Segmentation (library layers as above)

- Segments images and point clouds based on colour, depth and space.
- Usage: segmenting images or point clouds using colour models, depth, or spatial properties such as location, shape and size.
- Examples: skin-colour segmentation, depth thresholding.

A minimal segmentation sketch follows below.
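A minimal OpenCV sketch of the two examples shown: a skin-colour mask in YCrCb space and a simple depth threshold. The threshold values are illustrative assumptions, not the library's tuned parameters.

```python
# Minimal segmentation sketch: skin-colour mask (YCrCb) and depth threshold.
# Threshold values are illustrative, not the library's tuned parameters.
import cv2
import numpy as np

def skin_mask(bgr):
    """Rough skin segmentation in YCrCb space; returns a binary mask."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, (0, 133, 77), (255, 173, 127))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

def depth_mask(depth_m, near=0.4, far=1.0):
    """Keep only pixels between near and far (metres), e.g. hands above a table."""
    return ((depth_m > near) & (depth_m < far)).astype(np.uint8) * 255

def segment_hand(bgr, depth_m):
    """Combine colour and depth cues into one mask."""
    return cv2.bitwise_and(skin_mask(bgr), depth_mask(depth_m))
```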

Page 50: 426 Lecture 9: Research Directions in AR

Architecture – 3. Classification/Tracking (library layers as above)

- Identifies and tracks objects between frames based on XYZRGB data.
- Usage: identifying the current position/orientation of the tracked object in space.
- Examples: a training set of hand poses, where colours represent unique regions of the hand; raw (uncleaned) classifier output on real hand input from the depth image.

A toy tracking sketch follows below.
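A toy nearest-centroid tracker over segmented 3D points, illustrating only the idea of associating objects between frames; the real library performs per-region hand-part classification, which is not reproduced here.

```python
# Toy frame-to-frame tracker: associate each segmented blob with the nearest
# previous centroid in 3D. A stand-in for the library's XYZRGB-based tracking.
import numpy as np

class CentroidTracker:
    def __init__(self, max_jump=0.10):       # 10 cm association gate (illustrative)
        self.tracks = {}                      # id -> last 3D centroid
        self.next_id = 0
        self.max_jump = max_jump

    def update(self, blobs_xyz):
        """blobs_xyz: list of (N, 3) numpy arrays of points, one per segmented object."""
        assignments = {}
        for pts in blobs_xyz:
            c = pts.mean(axis=0)
            # find the closest existing track within the gate
            best, best_d = None, self.max_jump
            for tid, prev in self.tracks.items():
                d = np.linalg.norm(c - prev)
                if d < best_d:
                    best, best_d = tid, d
            if best is None:                  # a new object has entered the scene
                best = self.next_id
                self.next_id += 1
            self.tracks[best] = c
            assignments[best] = c
        return assignments                    # id -> current 3D position
```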

Page 51: 426 Lecture 9: Research Directions in AR

Architecture – 4. Modeling (library layers as above)

- Hand recognition/modeling: skeleton based (low-resolution approximation) or model based (more accurate representation).
- Object modeling: identification and tracking of rigid-body objects.
- Physical modeling (physical interaction): sphere proxy, model based, or mesh based.
- Usage: general spatial interaction in AR/VR environments.

Page 52: 426 Lecture 9: Research Directions in AR

Method: represent models as collections of spheres that move with the models in the Bullet physics engine (see the sketch below).
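A pybullet sketch (again an assumption, not the original implementation) of the sphere-proxy idea: the tracked hand is represented by a handful of spheres that are repositioned each frame to the latest tracked points, so the physics engine can resolve contacts between the real hand and virtual objects.

```python
# Sphere-proxy sketch: the tracked hand is a set of spheres that are moved to the
# latest tracked positions so Bullet can resolve hand-versus-virtual-object
# contacts. Illustrative only, not the original code.
import pybullet as p

p.connect(p.DIRECT)
p.setGravity(0, 0, -9.8)

def make_proxy(num_spheres=5, radius=0.01):
    """One sphere per tracked point (e.g. fingertips and palm), parked out of view."""
    shape = p.createCollisionShape(p.GEOM_SPHERE, radius=radius)
    return [p.createMultiBody(baseMass=0, baseCollisionShapeIndex=shape,
                              basePosition=[0, 0, -1]) for _ in range(num_spheres)]

hand_proxy = make_proxy()

def update_proxy(tracked_points):
    """tracked_points: list of (x, y, z) tuples from the hand tracker, one per sphere."""
    for body, pos in zip(hand_proxy, tracked_points):
        p.resetBasePositionAndOrientation(body, pos, [0, 0, 0, 1])

# Each frame: update_proxy(points_from_tracker); p.stepSimulation()
# Virtual objects then react to the moving spheres as if pushed by the real hand.
```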

Page 53: 426 Lecture 9: Research Directions in AR

Method: render the AR scene with OpenSceneGraph, using the depth map for occlusion.

Shadows are yet to be implemented.

Page 54: 426 Lecture 9: Research Directions in AR

Results

Page 55: 426 Lecture 9: Research Directions in AR

Architecture – 5. Gesture (library layers as above)

- Static gestures (hand pose recognition).
- Dynamic gestures (recognition of meaningful movement).
- Context-based gesture recognition (gestures with context, e.g. pointing).
- Usage: issuing commands, anticipating user intention, and high-level interaction.

An illustrative sketch of the three levels follows below.
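An illustrative sketch of the three levels: a static pose from a fingertip count, a dynamic swipe from centroid velocity, and a context-based "point at an object" gesture. The thresholds and classes are assumptions, not the library's.

```python
# Illustrative sketch of the three gesture levels; thresholds are assumptions.
import numpy as np

def static_gesture(num_extended_fingers):
    """Static: classify a single hand pose from the modeled hand."""
    return {0: "fist", 5: "open_hand"}.get(num_extended_fingers, "other")

def dynamic_gesture(centroid_history, min_speed=0.5):
    """Dynamic: detect a horizontal swipe from recent 3D centroids (metres, ~10 Hz)."""
    if len(centroid_history) < 2:
        return None
    v = (centroid_history[-1] - centroid_history[0]) * 10.0 / (len(centroid_history) - 1)
    if abs(v[0]) > min_speed and abs(v[0]) > 2 * abs(v[1]):
        return "swipe_right" if v[0] > 0 else "swipe_left"
    return None

def context_gesture(pose, finger_tip, finger_dir, scene_objects, max_dist=1.0):
    """Context-based: a pointing pose selects the object nearest the pointing ray.
    finger_tip, finger_dir and object centres are numpy arrays; finger_dir is unit length."""
    if pose != "point":
        return None
    best, best_d = None, max_dist
    for name, centre in scene_objects.items():
        to_obj = centre - finger_tip
        d = np.linalg.norm(np.cross(finger_dir, to_obj))   # perpendicular distance from the ray
        if d < best_d:
            best, best_d = name, d
    return ("select", best) if best else None
```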

Page 56: 426 Lecture 9: Research Directions in AR

Multimodal Interaction

Page 57: 426 Lecture 9: Research Directions in AR

Multimodal Interaction

- Combined speech input
- Gesture and speech are complementary:
  - Speech: modal commands, quantities
  - Gesture: selection, motion, qualities
- Previous work found multimodal interfaces intuitive for 2D/3D graphics interaction

Page 58: 426 Lecture 9: Research Directions in AR

1. Marker-Based Multimodal Interface

- Add speech recognition to VOMAR
- Paddle + speech commands

Page 59: 426 Lecture 9: Research Directions in AR
Page 60: 426 Lecture 9: Research Directions in AR

Commands Recognized

- Create command, "Make a blue chair": create a virtual object and place it on the paddle.
- Duplicate command, "Copy this": duplicate a virtual object and place it on the paddle.
- Grab command, "Grab table": select a virtual object and place it on the paddle.
- Place command, "Place here": place the attached object in the workspace.
- Move command, "Move the couch": attach a virtual object in the workspace to the paddle so that it follows the paddle movement.

A toy parser for these phrases follows below.

Page 61: 426 Lecture 9: Research Directions in AR

System Architecture

Page 62: 426 Lecture 9: Research Directions in AR

Object Relationships

"Put chair behind the table” Where is behind?

View specific regions

Page 63: 426 Lecture 9: Research Directions in AR

User Evaluation

- Performance time: speech + static paddle was significantly faster
- The gesture-only condition was less accurate for position/orientation
- Users preferred speech + paddle input

Page 64: 426 Lecture 9: Research Directions in AR

Subjective Surveys

Page 65: 426 Lecture 9: Research Directions in AR

2. Free-Hand Multimodal Input

- Use the free hand to interact with AR content
- Recognize simple gestures; no marker tracking
- Gestures: Point, Move, Pick/Drop

A minimal fusion sketch follows below.
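A minimal time-window fusion sketch: pair a recognized speech command with the most recent gesture event (and its 3D position) if it arrived within a short window. The window length and event names are assumptions, not the system's actual fusion rules.

```python
# Minimal speech/gesture fusion sketch: pair a speech command with the most
# recent gesture event inside a short time window. Window length is assumed.
import time

class MultimodalFusion:
    def __init__(self, window_s=1.5):
        self.window_s = window_s
        self.last_gesture = None             # (timestamp, kind, xyz)

    def on_gesture(self, kind, xyz):
        """kind: 'point', 'move' or 'pick_drop'; xyz: tracked fingertip position."""
        self.last_gesture = (time.time(), kind, xyz)

    def on_speech(self, command):
        """Fuse e.g. 'change colour to red' with the object most recently pointed at."""
        if self.last_gesture and time.time() - self.last_gesture[0] < self.window_s:
            _, kind, xyz = self.last_gesture
            return {"command": command, "gesture": kind, "target_position": xyz}
        return {"command": command, "gesture": None, "target_position": None}

fusion = MultimodalFusion()
fusion.on_gesture("point", (0.1, 0.0, 0.6))
print(fusion.on_speech("change colour to red"))
```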

Page 66: 426 Lecture 9: Research Directions in AR

Multimodal Architecture

Page 67: 426 Lecture 9: Research Directions in AR

Multimodal Fusion

Page 68: 426 Lecture 9: Research Directions in AR

Hand Occlusion

Page 69: 426 Lecture 9: Research Directions in AR

User Evaluation

- Task: change object shape, colour and position
- Conditions: speech only, gesture only, multimodal
- Measures: performance time, errors, subjective survey

Page 70: 426 Lecture 9: Research Directions in AR

Experimental Setup

Change object shape and colour

Page 71: 426 Lecture 9: Research Directions in AR

Results

- Average performance time (multimodal and speech fastest):
  - Gesture: 15.44 s
  - Speech: 12.38 s
  - Multimodal: 11.78 s
- No difference in user errors
- User subjective survey:
  - Q1, "How natural was it to manipulate the object?": MMI and speech rated significantly better
  - 70% preferred MMI, 25% speech only, 5% gesture only

Page 72: 426 Lecture 9: Research Directions in AR

Intelligent Interfaces

Page 73: 426 Lecture 9: Research Directions in AR

Intelligent Interfaces

- Most AR systems are not intelligent:
  - they don’t recognize user behaviour
  - they don’t provide feedback
  - they don’t adapt to the user
- Especially important for training: scaffolded learning, moving beyond checklists of actions

Page 74: 426 Lecture 9: Research Directions in AR

Intelligent Interfaces

- AR interface + intelligent tutoring system
- ASPIRE constraint-based system (from the University of Canterbury)
- Constraints consist of a relevance condition, a satisfaction condition, and feedback

A generic constraint sketch follows below.
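ASPIRE is a constraint-based tutoring system from the University of Canterbury; the sketch below is a generic illustration of the constraint idea (relevance condition, satisfaction condition, feedback), not ASPIRE's actual API or domain model.

```python
# Generic constraint-based-tutoring sketch (not ASPIRE's API): each constraint
# has a relevance condition, a satisfaction condition and feedback text.
from dataclasses import dataclass
from typing import Any, Callable, Dict

@dataclass
class Constraint:
    relevance: Callable[[Dict[str, Any]], bool]     # when does this constraint apply?
    satisfaction: Callable[[Dict[str, Any]], bool]  # is the student's step correct?
    feedback: str                                   # shown only when relevant but violated

def check(state, constraints):
    """Return corrective feedback for every relevant-but-violated constraint."""
    return [c.feedback for c in constraints
            if c.relevance(state) and not c.satisfaction(state)]

# Illustrative constraint for a hypothetical assembly-training task:
constraints = [
    Constraint(relevance=lambda s: s.get("step") == "attach_motor",
               satisfaction=lambda s: s.get("bracket_fitted", False),
               feedback="Fit the mounting bracket before attaching the motor.")
]
print(check({"step": "attach_motor", "bracket_fitted": False}, constraints))
```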

Page 75: 426 Lecture 9: Research Directions in AR

Domain Ontology

Page 76: 426 Lecture 9: Research Directions in AR

Intelligent Feedback

- Actively monitors user behaviour
- Implicit vs. explicit interaction
- Provides corrective feedback

Page 77: 426 Lecture 9: Research Directions in AR
Page 78: 426 Lecture 9: Research Directions in AR

Evaluation Results

- 16 subjects, with and without the ITS
- Improved task completion
- Improved learning

Page 79: 426 Lecture 9: Research Directions in AR

Intelligent Agents

- AR characters: virtual embodiment of the system; multimodal input/output
- Examples: AR Lego, Welbo, Mr Virtuoso
  - The AR character was rated more real and more fun
  - On-screen 3D and AR were similar in usefulness

Page 80: 426 Lecture 9: Research Directions in AR

Context Sensing

Page 81: 426 Lecture 9: Research Directions in AR

Context Sensing

- TKK project
- Using context to manage information
- Context from: speech, gaze, the real world
- AR display

Page 82: 426 Lecture 9: Research Directions in AR
Page 83: 426 Lecture 9: Research Directions in AR
Page 84: 426 Lecture 9: Research Directions in AR
Page 85: 426 Lecture 9: Research Directions in AR

Gaze Interaction

Page 86: 426 Lecture 9: Research Directions in AR

AR View

Page 87: 426 Lecture 9: Research Directions in AR

More Information Over Time

Page 88: 426 Lecture 9: Research Directions in AR

Experiences

Page 89: 426 Lecture 9: Research Directions in AR

Novel Experiences

- Crossing boundaries: ubiquitous VR/AR
- Collaborative experiences: massive AR
- AR + social networking
- Usability

Page 90: 426 Lecture 9: Research Directions in AR

Crossing Boundaries

Jun Rekimoto, Sony CSL

Page 91: 426 Lecture 9: Research Directions in AR

Invisible Interfaces

Jun Rekimoto, Sony CSL

Page 92: 426 Lecture 9: Research Directions in AR

Milgram’s Reality-Virtuality Continuum

Mixed Reality spans the Reality-Virtuality (RV) continuum:
Real Environment → Augmented Reality (AR) → Augmented Virtuality (AV) → Virtual Environment

Page 93: 426 Lecture 9: Research Directions in AR

The MagicBook

The MagicBook moves along the continuum from Reality through Augmented Reality (AR) and Augmented Virtuality (AV) to Virtuality.

Page 94: 426 Lecture 9: Research Directions in AR

Invisible Interfaces

Jun Rekimoto, Sony CSL

Page 95: 426 Lecture 9: Research Directions in AR

Example: Visualizing Sensor Networks

- Rauhala et al. 2007 (Linköping)
- Network of humidity sensors with ZigBee wireless communication
- Mobile AR used to visualize humidity

Page 96: 426 Lecture 9: Research Directions in AR
Page 97: 426 Lecture 9: Research Directions in AR
Page 98: 426 Lecture 9: Research Directions in AR

Invisible Interfaces

Jun Rekimoto, Sony CSL

Page 99: 426 Lecture 9: Research Directions in AR

UbiVR – CAMAR

CAMAR Companion

CAMAR Viewer

CAMAR Controller

GIST - Korea

Page 100: 426 Lecture 9: Research Directions in AR

ubiHome @ GIST (©ubiHome)

(Diagram: ubiHome smart-home testbed. Sensors and devices such as ubiKey, a couch sensor, a PDA, Tag-it, a door sensor and ubiTrack supply who/what/when/where/how context to services such as media services, a light service and an MR window.)

Page 101: 426 Lecture 9: Research Directions in AR

CAMAR - GIST

(CAMAR: Context-Aware Mobile Augmented Reality)

Page 102: 426 Lecture 9: Research Directions in AR

UCAM: Architecture

(Diagram: UCAM builds on the operating system and network interfaces (BAN/PAN over Bluetooth; TCP/IP for discovery, control and events), with a context interface and a sensor service (integrator, manager, interpreter, service provider) beneath the content layer. Variants include wear-UCAM, vr-UCAM and ubi-UCAM.)

Page 103: 426 Lecture 9: Research Directions in AR

Hybrid User Interfaces

- Goal: incorporate AR into a normal meeting environment
- Physical components: real props
- Display elements: 2D and 3D (AR) displays
- Interaction metaphor: use multiple tools, each relevant for the task at hand

Page 104: 426 Lecture 9: Research Directions in AR

Hybrid User Interfaces

1. Personal: private display
2. Tabletop: private display + group display
3. Whiteboard: private display + public display
4. Multigroup: private display + group display + public display

Page 105: 426 Lecture 9: Research Directions in AR

(Diagram, from Joe Newman: Milgram’s reality-virtuality axis plotted against Weiser’s terminal-ubiquitous axis. Desktop, AR and VR sit at the terminal end; UbiComp, Mobile AR, Ubi AR and Ubi VR sit at the ubiquitous end.)

Page 106: 426 Lecture 9: Research Directions in AR

(Diagram: the same reality-virtuality vs. terminal-ubiquitous space, extended along a third axis from single user to massive multi-user.)

Page 107: 426 Lecture 9: Research Directions in AR

Remote Collaboration

Page 108: 426 Lecture 9: Research Directions in AR

AR Client

- HMD and handheld display (HHD)
- Shows virtual images over the real world
- Images drawn by a remote expert
- Local interaction

Page 109: 426 Lecture 9: Research Directions in AR

Shared Visual Context (Fussell, 1999)

- Remote video collaboration: shared manual, video viewing
- Compared video, audio, and side-by-side collaboration
- Communication analysis

Page 110: 426 Lecture 9: Research Directions in AR

WACL (Kurata, 2004)

- Wearable camera / laser pointer
- Independent pointer control
- Remote panorama view

Page 111: 426 Lecture 9: Research Directions in AR

WACL (Kurata, 2004)

- Remote expert view: panorama viewing, annotation, image capture

Page 112: 426 Lecture 9: Research Directions in AR

As If Being There (Poelman, 2012)

- AR + scene capture
- HMD viewing with a remote expert
- Gesture input
- Scene capture (PTAM), stereo camera

Page 113: 426 Lecture 9: Research Directions in AR

As If Being There (Poelman, 2012)

- Gesture interaction: hand postures recognized, menus superimposed on the hands

Page 114: 426 Lecture 9: Research Directions in AR

Real World Capture

- Using the Kinect for 3D scene capture
- Camera tracking, AR overlay, remote situational awareness

Page 115: 426 Lecture 9: Research Directions in AR

Remote scene capture with AR annotations added

Page 116: 426 Lecture 9: Research Directions in AR

Massive Multiuser

- Handheld AR allows, for the first time, extremely high numbers of AR users
- Requires: new types of applications/games, new infrastructure (server/client/peer-to-peer), content distribution…


Page 117: 426 Lecture 9: Research Directions in AR

Massive Multiuser

- 2D applications: MSN – 29 million; Skype – 10 million; Facebook – 100 million+
- 3D/VR applications: Second Life – >50K; stereo projection – <500
- Augmented Reality: Shared Space (1999) – 4; Invisible Train (2004) – 8

Page 118: 426 Lecture 9: Research Directions in AR

BASIC VIEW

Page 119: 426 Lecture 9: Research Directions in AR

PERSONAL VIEW

Page 120: 426 Lecture 9: Research Directions in AR

Augmented Reality 2.0 Infrastructure

Page 121: 426 Lecture 9: Research Directions in AR

Leveraging Web 2.0

- Content retrieval using HTTP
- XML-encoded meta information: KML placemarks + extensions
- Queries based on location (from GPS or image recognition) or on situation (barcode markers)
- Queries also deliver tracking feature databases
- Everybody can set up an AR 2.0 server
- Syndication: community servers for end-user content, tagging
- The AR client subscribes to an arbitrary number of feeds

A minimal query sketch follows below.
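A minimal sketch of the query pattern described above: an HTTP GET carrying the device's GPS position against a community server (the URL and query parameters are hypothetical), returning KML placemarks that the client parses using the standard KML schema.

```python
# Minimal AR 2.0 query sketch: fetch KML placemarks near a GPS position.
# The server URL and parameters are hypothetical; KML parsing follows the standard schema.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

def fetch_placemarks(lat, lon, radius_m=500,
                     server="http://ar-content.example.org/query"):   # hypothetical endpoint
    url = server + "?" + urllib.parse.urlencode(
        {"lat": lat, "lon": lon, "radius": radius_m})
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    placemarks = []
    for pm in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        name = pm.find("kml:name", KML_NS)
        coords = pm.find(".//kml:coordinates", KML_NS)
        placemarks.append((name.text if name is not None else "",
                           coords.text.strip() if coords is not None else ""))
    return placemarks   # a client would fetch tracking feature databases the same way

# Example: placemarks = fetch_placemarks(-43.52, 172.58)
```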

Page 122: 426 Lecture 9: Research Directions in AR

Content

- Content creation and delivery: content creation pipeline, delivering previously unknown content
- Streaming of data (objects, multimedia) and of applications
- Distribution: how do users learn about all that content, and how do they access it?

Page 123: 426 Lecture 9: Research Directions in AR

ARML (AR Markup Language)

Page 124: 426 Lecture 9: Research Directions in AR

Scaling Up

- AR on a city scale
- Using mobile phones as ubiquitous sensors
- MIT Senseable City Lab: http://senseable.mit.edu/

Page 125: 426 Lecture 9: Research Directions in AR
Page 126: 426 Lecture 9: Research Directions in AR

WikiCity Rome (Senseable City Lab MIT)

Page 127: 426 Lecture 9: Research Directions in AR

Conclusions

Page 128: 426 Lecture 9: Research Directions in AR

AR Research in the HIT Lab NZ

- Gesture interaction: gesture library
- Multimodal interaction: collaborative speech/gesture interfaces
- Mobile AR interfaces: outdoor AR, interaction methods, navigation tools
- AR authoring tools: visual programming for AR
- Remote collaboration: mobile AR for remote interaction

Page 129: 426 Lecture 9: Research Directions in AR

More Information

•  Mark Billinghurst

– [email protected]

•  Websites –  http://www.hitlabnz.org/ –  http://artoolkit.sourceforge.net/ –  http://www.osgart.org/ –  http://www.hitlabnz.org/wiki/buildAR/