Using the Kinect for Fun and Profit by Tam Hanna


Few devices offer features as fascinating as the Microsoft Kinect. This seminar teaches you what the Kinect can do and how you can develop for it. Attendees are encouraged to bring a notebook with Visual C# 2010 Express Edition and the latest Kinect SDK installed so that they can profit fully from the talk. A sensor will be available for testing your own applications.


Using the Kinect for fun and profit

About /me

• Tam HANNA – Director, Tamoggemon Holding k.s.
– Runs web sites about mobile computing
– Writes scientific books

Agenda

• Kinect – what is that?

• Streams

• Skeletons

• Facial tracking

• libfreenect

• OpenNI

Slide download

• http://www.tamoggemon.com/test/Codemotion-Kinect.ppt

• URL is case-sensitive

Kinect – what is that?

History - I

• Depth: PrimeSense technology
– Not from Redmond

• First public mention: 2007
– Bill Gates, D5 conference
– "Camera for game control"

Contrast detection

Where does the shirt end?

Dot matrix

Shadows / dead areas

Shadows / dead areas - II

History - II

• 2006: Wii ships
– Best-selling console of its generation

• 2009: E3 conference
– Announcement of "Project Natal"

• 2010: dedicated CPU dropped from the sensor
– Processing takes ~10% of the Xbox 360 CPU

History - III

• November 4, 2010
– First shipment
– "We will sue anyone who reverse engineers"

• June 2011
– Official SDK released

System overview

Kinect provides

• Video stream

• Depth stream
– (IR stream)

• Accelerometer data

• Rest: computed

Family tree

• Kinect for Xbox
– Normal USB

• Kinect bundle
– Nonstandard, Microsoft-specific USB
– Needs its own PSU

• Kinect for Windows
– Costs more
– Legal to deploy commercially

Cheap from China

Streams

Kinect provides "streams"

• Repeatedly updated bitmaps

• Push or pull processing possible
– Attention: processing time! (see the pull-mode sketch below)
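For pull mode, a polling loop does the job. A minimal sketch, assuming a started KinectSensor in mySensor with its color stream enabled; keepRunning is our own flag, not an SDK feature:

byte[] pixels = new byte[mySensor.ColorStream.FramePixelDataLength];
while (keepRunning)
{
    // wait up to 100 ms for the next frame; returns null on timeout
    using (ColorImageFrame frame = mySensor.ColorStream.OpenNextFrame(100))
    {
        if (frame != null)
        {
            frame.CopyPixelDataTo(pixels);
            // ... process pixels; keep this short or frames get dropped
        }
    }
}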

Color stream

• Two modes
– VGA (640x480) @ 30 fps
– 1280x960 @ 12 fps

• Simple data format
– 8 bits per component
– R / G / B / A components (delivered in BGRA byte order; see the sketch below)
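For display in WPF, the raw bytes can be wrapped in a bitmap. A sketch, assuming myColorArray holds one VGA frame and image1 is the Image control from the XAML shown later; since the SDK delivers BGRA bytes, Bgr32 matches the layout:

BitmapSource bmp = BitmapSource.Create(
    640, 480, 96.0, 96.0,
    PixelFormats.Bgr32, null,
    myColorArray, 640 * 4);   // stride: 4 bytes per pixel
image1.Source = bmp;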

Depth stream

• Two modes
– Unlimited range
– Reduced range, with player indexing

Depth stream - II

• 16-bit words

• Special encoding for the limited-range mode (decoding sketch below):

Bit layout of one word (MSB → LSB):

Depth[12] Depth[11] … Depth[1] Depth[0] | Player[2] Player[1] Player[0]

– Upper 13 bits: depth in millimeters
– Lower 3 bits: player index (0 = no player)
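Decoding is masking and shifting; the SDK exposes the magic numbers as constants. A sketch, assuming myArray holds a player-indexed depth frame:

foreach (short raw in myArray)
{
    int player  = raw & DepthImageFrame.PlayerIndexBitmask;                // low 3 bits
    int depthMm = (ushort)raw >> DepthImageFrame.PlayerIndexBitmaskWidth;  // high 13 bits
    // ... use player / depthMm ...
}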

Depth stream - III

IR stream

• Delivered instead of color data

• 640x480 @ 30 fps

• 16-bit words

• IR data in the 10 MSBs (sketch below)
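A sketch of reading IR, assuming SDK 1.6+, where the IR image travels over the color stream:

mySensor.ColorStream.Enable(ColorImageFormat.InfraredResolution640x480Fps30);
byte[] irData = new byte[mySensor.ColorStream.FramePixelDataLength];
// ... after a frame's CopyPixelDataTo(irData):
// each pixel is a little-endian 16-bit word; the IR value sits in the 10 MSBs
ushort word = (ushort)(irData[0] | (irData[1] << 8));
int intensity = word >> 6;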

Finding the Kinect

• SDK supports multiple sensors per PC

• First step: find one

• Microsoft.Kinect.Toolkit helps here

XAML part

<Window x:Class="KinectWPFD2.MainWindow"
        xmlns:toolkit="clr-namespace:Microsoft.Kinect.Toolkit;assembly=Microsoft.Kinect.Toolkit"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="MainWindow" Height="759" Width="704">
    <Grid>
        <Image Height="480" HorizontalAlignment="Left" Name="image1"
               Stretch="Fill" VerticalAlignment="Top" Width="640" />
        <toolkit:KinectSensorChooserUI x:Name="SensorChooserUI" IsListening="True"
               HorizontalAlignment="Center" VerticalAlignment="Top" />
        <CheckBox Content="Render overlay" Height="16" HorizontalAlignment="Left"
               Margin="267,500,0,0" Name="ChkRender" VerticalAlignment="Top" />
    </Grid>
</Window>

Code - I

public partial class MainWindow : Window
{
    KinectSensor mySensor;
    KinectSensorChooser myChooser;

    public MainWindow()
    {
        InitializeComponent();
        myChooser = new KinectSensorChooser();
        myChooser.KinectChanged +=
            new EventHandler<KinectChangedEventArgs>(myChooser_KinectChanged);
        this.SensorChooserUI.KinectSensorChooser = myChooser;
        myChooser.Start();
    }

Code - II

void myChooser_KinectChanged(object sender, KinectChangedEventArgs e)
{
    if (null != e.OldSensor)
    {
        if (mySensor != null) { mySensor.Dispose(); }
    }

    if (null != e.NewSensor)
    {
        mySensor = e.NewSensor;

        // Initialize streams
        mySensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
        mySensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
        myArray = new short[this.mySensor.DepthStream.FramePixelDataLength];
        myColorArray = new byte[this.mySensor.ColorStream.FramePixelDataLength];
        mySensor.AllFramesReady +=
            new EventHandler<AllFramesReadyEventArgs>(mySensor_AllFramesReady);

        try
        {
            this.mySensor.Start();
            SensorChooserUI.Visibility = Visibility.Hidden;
        }

Process stream

void mySensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    ColorImageFrame c = e.OpenColorImageFrame();
    DepthImageFrame d = e.OpenDepthImageFrame();

    if (c == null || d == null) return;

    c.CopyPixelDataTo(myColorArray);
    d.CopyPixelDataTo(myArray);

Problem: Calibration

• Depth and Color sensors are not aligned

• The same array index does not describe the same spot in both frames

Solution

• CoordinateMapper class

• Maps between various frame types (sketch below)
– Depth and Color
– Skeleton and Color
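A sketch of the depth-to-color case, assuming SDK 1.6+ (where CoordinateMapper lives on the sensor) and the arrays from the code above:

ColorImagePoint[] myColorPoints =
    new ColorImagePoint[mySensor.DepthStream.FramePixelDataLength];
mySensor.CoordinateMapper.MapDepthFrameToColorFrame(
    DepthImageFormat.Resolution640x480Fps30, myArray,
    ColorImageFormat.RgbResolution640x480Fps30, myColorPoints);
// myColorPoints[i] now holds the color-frame (x, y) for depth pixel i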

On Push mode

• Kinect can push data to the application

• Preferred mode of operation

• But: sensitive to processing time

• If the handler takes too long -> frame delivery stalls (defensive sketch below)
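One defensive pattern: drop frames while the previous one is still being processed. A minimal sketch; the busy flag is our own addition, not an SDK feature:

bool busy = false;   // guard field on the window class

void mySensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    if (busy) return;   // drop this frame instead of queueing work
    busy = true;
    try
    {
        // ... copy pixel data and do the heavy lifting ...
    }
    finally { busy = false; }
}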

Skeletons

What is tracked?

• Data format
– Real-world coordinates (meters)

• Mappable to the color frame

Initialize stream

if (null != e.NewSensor)
{
    mySensor = e.NewSensor;
    mySensor.SkeletonStream.Enable();

Get joints

void mySensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
{
    ColorImageFrame c = e.OpenColorImageFrame();
    SkeletonFrame s = e.OpenSkeletonFrame();

    if (c == null || s == null) return;

    c.CopyPixelDataTo(myColorArray);
    s.CopySkeletonDataTo(mySkeletonArray);

    foreach (Skeleton aSkeleton in mySkeletonArray)
    {
        DrawBone(aSkeleton.Joints[JointType.HandLeft],
                 aSkeleton.Joints[JointType.WristLeft],
                 armPen, drawingContext);

Use joints

private void DrawBone(Joint jointFrom, Joint jointTo, Pen aPen, DrawingContext aContext)
{
    if (jointFrom.TrackingState == JointTrackingState.NotTracked ||
        jointTo.TrackingState == JointTrackingState.NotTracked)
    {
        // not tracked at all: nothing to draw
    }

    if (jointFrom.TrackingState == JointTrackingState.Inferred ||
        jointTo.TrackingState == JointTrackingState.Inferred)
    {
        ColorImagePoint p1 = mySensor.CoordinateMapper.MapSkeletonPointToColorPoint(
            jointFrom.Position, ColorImageFormat.RgbResolution640x480Fps30);
    }

    if (jointFrom.TrackingState == JointTrackingState.Tracked ||
        jointTo.TrackingState == JointTrackingState.Tracked)

Facial tracking

What is tracked - I

What is tracked - II

What is tracked - III

AUs?

• Action Units: research by Paul EKMAN

• Quantify facial motion

Structure

• C++ library with the algorithms

• Basic .NET wrapper provided
– Incomplete
– Might change!

Initialize face tracker

myFaceTracker = new FaceTracker(mySensor);

Feed face tracker

FaceTrackFrame myFrame = null;
foreach (Skeleton aSkeleton in mySkeletonArray)
{
    if (aSkeleton.TrackingState == SkeletonTrackingState.Tracked)
    {
        myFrame = myFaceTracker.Track(
            ColorImageFormat.RgbResolution640x480Fps30, myColorArray,
            DepthImageFormat.Resolution640x480Fps30, myArray, aSkeleton);

        if (myFrame.TrackSuccessful == true) { break; }
    }
}
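Once a track succeeds, the Action Units from the earlier slide can be read out. A sketch; the AnimationUnit names come from the toolkit wrapper, and values land roughly in [-1, +1]:

if (myFrame != null && myFrame.TrackSuccessful)
{
    var myAUs = myFrame.GetAnimationUnitCoefficients();
    float jaw  = myAUs[AnimationUnit.JawLower];    // mouth open / closed
    float brow = myAUs[AnimationUnit.BrowRaiser];  // eyebrows raised
}

How far these values swing differs from face to face; that is the calibration problem below.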

Calibration

• OUCH!
– Not all snouts are equal

• Maximums vary

libfreenect

What is it

• Result of the Kinect hacking competition

• Bundled with most Linux distributions

• "Basic Kinect data parser"

Set-up

• /etc/udev/rules.d/66-kinect.rules

# Rules for Kinect
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02ae", MODE="0660", GROUP="video"
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02ad", MODE="0660", GROUP="video"
SYSFS{idVendor}=="045e", SYSFS{idProduct}=="02b0", MODE="0660", GROUP="video"
# END

Set-up II

• sudo adduser $USER plugdev

• sudo usermod -a -G video tamhan

tamhan@tamhan-X360:~$ freenect-glview
Kinect camera test
Number of devices found: 1
Could not claim interface on camera: -6
Could not open device

Set-up III

Problems

• gspca-kinect
– Kernel module, uses the Kinect as a webcam
– Blocks other libraries
– Fix: sudo modprobe -r gspca_kinect

• Outdated library version widely deployed
– API not compatible

Update library

• sudo foo

• sudo add-apt-repository ppa:floe/libtisch

• sudo apt-get update

• sudo apt-get install libfreenect libfreenect-dev libfreenect-demos

libfreenect - II

color stream

Implementing it

• libfreenect: C++ library

• Question: which GUI framework?

• Answer: Qt (what else? ;))

The .pro file

QT += core gui
TARGET = QtDepthFrame
CONFIG += i386
DEFINES += USE_FREENECT
LIBS += -lfreenect

The freenect thread

• Library needs processing time
– Does not spawn a thread of its own

• Should be provided outside of the main app thread

class QFreenectThread : public QThread
{
    Q_OBJECT
public:
    explicit QFreenectThread(QObject *parent = 0);
    void run();

public:
    bool myActive;
    freenect_context *myContext;
};

QFreenectThread::QFreenectThread(QObject *parent) : QThread(parent)
{
}

void QFreenectThread::run()
{
    // pump libfreenect until told to stop
    while (myActive)
    {
        if (freenect_process_events(myContext) < 0)
        {
            qDebug("Cannot process events!");
            QApplication::exit(1);
        }
    }
}

QFreenect

• Main engine module
– Contact point between Kinect and app

• Fires signals when a frame is available

class QFreenect : public QObject
{
    Q_OBJECT
public:
    explicit QFreenect(QObject *parent = 0);
    ~QFreenect();
    void processVideo(void *myVideo, uint32_t myTimestamp=0);
    void processDepth(void *myDepth, uint32_t myTimestamp=0);

signals:
    void videoDataReady(uint8_t* myRGBBuffer);
    void depthDataReady(uint16_t* myDepthBuffer);

private:
    freenect_context *myContext;
    freenect_device *myDevice;
    QFreenectThread *myWorker;
    uint8_t* myRGBBuffer;
    uint16_t* myDepthBuffer;
    QMutex* myMutex;

public:
    bool myWantDataFlag;
    bool myFlagFrameTaken;
    bool myFlagDFrameTaken;
    static QFreenect* mySelf;
};

Some C++

QFreenect* QFreenect::mySelf;

static inline void videoCallback(freenect_device *myDevice, void *myVideo, uint32_t myTimestamp=0)
{
    QFreenect::mySelf->processVideo(myVideo, myTimestamp);
}

static inline void depthCallback(freenect_device *myDevice, void *myDepth, uint32_t myTimestamp=0)
{
    QFreenect::mySelf->processDepth(myDepth, myTimestamp);
}

Bring-up

QFreenect::QFreenect(QObject *parent) : QObject(parent)
{
    myMutex=NULL;
    myRGBBuffer=NULL;
    myMutex=new QMutex();
    myWantDataFlag=false;
    myFlagFrameTaken=true;
    mySelf=this;

    if (freenect_init(&myContext, NULL) < 0)
    {
        qDebug("init failed");
        QApplication::exit(1);
    }

Bring-up – II

    freenect_set_log_level(myContext, FREENECT_LOG_FATAL);

    int nr_devices = freenect_num_devices(myContext);
    if (nr_devices < 1)
    {
        freenect_shutdown(myContext);
        qDebug("No Kinect found!");
        QApplication::exit(1);
    }

    if (freenect_open_device(myContext, &myDevice, 0) < 0)
    {
        qDebug("Open Device Failed!");
        freenect_shutdown(myContext);
        QApplication::exit(1);
    }

    // set up the RGB buffer and start the video stream
    myRGBBuffer = (uint8_t*)malloc(640*480*3);
    freenect_set_video_callback(myDevice, videoCallback);
    freenect_set_video_buffer(myDevice, myRGBBuffer);
    freenect_frame_mode vFrame =
        freenect_find_video_mode(FREENECT_RESOLUTION_MEDIUM, FREENECT_VIDEO_RGB);
    freenect_set_video_mode(myDevice, vFrame);
    freenect_start_video(myDevice);

    // hand the context to the worker thread
    myWorker = new QFreenectThread(this);
    myWorker->myActive = true;
    myWorker->myContext = myContext;
    myWorker->start();

Shut-Down

QFreenect::~QFreenect()
{
    freenect_close_device(myDevice);
    freenect_shutdown(myContext);
    if (myRGBBuffer != NULL) free(myRGBBuffer);
    if (myMutex != NULL) delete myMutex;
}

Data passing

void QFreenect::processVideo(void *myVideo, uint32_t myTimestamp)
{
    QMutexLocker locker(myMutex);
    if (myWantDataFlag && myFlagFrameTaken)
    {
        // hand out a copy; the receiver is expected to free() it
        uint8_t* mySecondBuffer = (uint8_t*)malloc(640*480*3);
        memcpy(mySecondBuffer, myVideo, 640*480*3);
        myFlagFrameTaken = false;
        emit videoDataReady(mySecondBuffer);
    }
}

Format of data word

• Array of bytes

• Three bytes = one pixel

Format of data word - II

for (int x = 2; x < 640; x++)
{
    for (int y = 0; y < 480; y++)
    {
        r = myRGBBuffer[3*(x+y*640)+0];
        g = myRGBBuffer[3*(x+y*640)+1];
        b = myRGBBuffer[3*(x+y*640)+2];
        myVideoImage->setPixel(x, y, qRgb(r, g, b));
    }
}

libfreenect - III

depth stream

Extra bring-up

myDepthBuffer = (uint16_t*)malloc(640*480*2);
freenect_set_depth_callback(myDevice, depthCallback);
freenect_set_depth_buffer(myDevice, myDepthBuffer);
freenect_frame_mode aFrame =
    freenect_find_depth_mode(FREENECT_RESOLUTION_MEDIUM, FREENECT_DEPTH_REGISTERED);
freenect_set_depth_mode(myDevice, aFrame);
freenect_start_depth(myDevice);

Extra processing

void QFreenect::processDepth(void *myDepth, uint32_t myTimestamp)
{
    QMutexLocker locker(myMutex);
    if (myWantDataFlag && myFlagDFrameTaken)
    {
        uint16_t* mySecondBuffer = (uint16_t*)malloc(640*480*2);
        memcpy(mySecondBuffer, myDepth, 640*480*2);
        myFlagDFrameTaken = false;
        emit depthDataReady(mySecondBuffer);
    }
}

Data extraction

void MainWindow::depthDataReady(uint16_t* myDepthBuffer)
{
    if (myDepthImage != NULL) delete myDepthImage;
    myDepthImage = new QImage(640, 480, QImage::Format_RGB32);
    unsigned char r, g, b;
    for (int x = 2; x < 640; x++)
    {
        for (int y = 0; y < 480; y++)
        {
            int calcval = myDepthBuffer[x + y*640];

Data is in millimeters

            if (calcval == FREENECT_DEPTH_MM_NO_VALUE)
            {
                r = 255; g = 0; b = 0;                    // no reading: mark red
            }
            else if (calcval > 1000 && calcval < 2000)    // between 1 m and 2 m
            {
                QRgb aVal = myVideoImage->pixel(x, y);
                r = qRed(aVal); g = qGreen(aVal); b = qBlue(aVal);
            }
            else
            {
                r = 0; g = 0; b = 0;
            }
            myDepthImage->setPixel(x, y, qRgb(r, g, b));

Example

OpenNI

What is OpenNI?

• Open standard for "Natural Interaction"
– Very Asus-centric

• Provides a generic NI framework

• VERY complex API

Version 1.5 vs Version 2.0

Supported platforms

• Linux

• Windows
– 32-bit only

Want more?

• Book
– German language
– 30 euros

• Launch
– When it's done!

?!?

tamhan@tamoggemon.com
@tamhanna

Images: pedroserafin, mattbuck