Kinect SoundLab Controller
Iain Laird, August 2011

Introduction

The Microsoft Kinect Controller has been an incredibly popular subject on 'hacker' forums in the year since its release. A bounty advertised on the website adafruit.com offered a cash prize to the first person to develop and publish open source drivers for the device. Since it was won, a vast number of projects have incorporated the controller; some were practical utility applications (like basic 3D point cloud mapping) and others were more creative (digital hand puppetry and game sprite control). The author has published a review of initial experiences with the controller and will expand on those findings here, the focus being on using the Kinect Controller in the SoundLab.

Background

Previous work by the author touched upon the use of OpenNI and skeletal mapping with the Kinect, allowing the controller to track, in three dimensions, the limbs of a participant positioned in front of it. The main aim of this activity was to extract meaningful data from such a system and use it as a control parameter in MaxMSP. Specifically, the author hoped to pan ambisonic sounds around a SoundLab by pointing to where they should be positioned.

Jit.freenect.grab + OpenCV style processing

Figure 1: jit.freenect.grab in MaxMSP, a collection of patches written by Blair Neal

The jit.freenect.grab Max object released by J.M. Pelletier, along with his OpenCV objects, provides a fantastic way of interacting with the Kinect Controller in MaxMSP. The jit.freenect.grab object creates a Jitter movie matrix which can then be processed in numerous different ways. The OpenCV objects can be used with the Kinect to recognise and track 'blobs' caused by a user's presence in front of the controller. This kind of system allows interaction with menus and on-screen controls in a manner similar to a multitouch screen, without the need for a touchscreen surface. This method does not track 3D coordinates but instead uses the depth data provided by the Kinect to define 'boundaries' of interaction in the space in front of the user. It enables finer gestural analysis and control, i.e. tracking individual fingers.
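The OpenCV objects are patched graphically in Max rather than written as text, but the underlying idea can be illustrated in a few lines. The following Python/OpenCV sketch (not part of the original patch; the depth frame source, thresholds and minimum blob size are assumptions) slices the depth image between a near and far boundary and locates the resulting blobs:

    import cv2
    import numpy as np

    def track_blobs(depth_mm, near=500, far=900):
        """Keep only pixels inside a depth 'boundary' in front of the
        sensor, then locate the resulting blobs (hands, fingers etc.).
        Thresholds are illustrative placeholders, in millimetres."""
        mask = cv2.inRange(depth_mm, near, far)
        # Remove sensor speckle before looking for contours.
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        centroids = []
        for c in contours:
            if cv2.contourArea(c) < 200:   # ignore tiny noise blobs
                continue
            m = cv2.moments(c)
            centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
        return centroids  # 2D blob positions, usable like touch points

Each returned centroid behaves much like a touch point on a multitouch surface, which is how the menu-style interaction described above is achieved.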


OpenNI + Primesense style processing

The previous report on the Kinect touched on the 'FAAST' application, which uses OpenNI to map a skeleton onto a person in front of the controller so that they can be tracked in three dimensions. The person stands in the standard 'psi' pose (referring to the Greek letter Ψ), which allows the software to discern the separate limbs and joints; once calibrated, these can be tracked over time.

Figure 2: An example of FAAST working with the Kinect

This activity uncovered a similar command line application called 'OSCeleton', which provides a similar level of processing: it takes skeleton data from the OpenNI framework and streams the 3D coordinates of the skeleton's joints as OpenSoundControl messages. MaxMSP has extensive support for OSC, and once the data reaches the patch it is straightforward to map it to a meaningful parameter such as volume, filter cutoff frequency, or the azimuth and elevation of an ambisonic sound source.
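To give a concrete idea of what the patch receives, the sketch below listens for OSCeleton's joint messages in Python using the python-osc package. The address pattern and argument order shown (/joint name userID x y z) and the default port of 7110 are the author's understanding of OSCeleton's defaults and should be verified against the installed version:

    from pythonosc import dispatcher, osc_server

    joints = {}  # latest known position of each tracked joint

    def on_joint(address, name, user_id, x, y, z):
        # Assumed message layout: /joint <name> <userID> <x> <y> <z>
        joints[(user_id, name)] = (x, y, z)

    disp = dispatcher.Dispatcher()
    disp.map("/joint", on_joint)

    # OSCeleton is assumed to send to localhost on port 7110 by default.
    server = osc_server.BlockingOSCUDPServer(("127.0.0.1", 7110), disp)
    server.serve_forever()

In Max the equivalent job is done with udpreceive and OSC routing objects; the dictionary above simply stands in for the number boxes of the test patch.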

The OpenSoundControl data format has been reported on previously by the author for a number of different applications, and has recently proved its worth in controlling Max patches from an iPad using TouchOSC. The details of OSC will not be discussed here; briefly, it can be described as an advanced MIDI-like system which allows control of devices and software connected to a network.
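For comparison with MIDI, sending an OSC control message is a one-liner. The address and value below are purely illustrative:

    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient("127.0.0.1", 9000)  # receiver's host and port

    # Unlike MIDI's fixed 7-bit controllers, OSC carries human-readable
    # addresses and typed arguments (floats, ints, strings) over a network.
    client.send_message("/soundlab/source/1/azimuth", 45.0)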


Method + installation

The installation of this system cannot truly be credited to the author, as an extremely useful installation procedure has been published on the internet by Tohm Judson. This procedure assumes no command line experience at all, making it extremely accessible, and all elements of the installation are completely open source. The only cautionary note is that the installation requires very specific versions of certain pieces of software; if these become unavailable, the procedure may need to be altered in order to work correctly.

Figure 3: OSCeleton running with the Stickmanetic Processing sketch (screenshot obtained from the internet)

There are two software outputs possible from this process. The first is a strictly visual application which takes the coordinates of the joints and draws a skeleton in the program 'Processing'. The skeleton mirrors the user's movements and allows them to 'catch' falling balls on screen. The second output is a MaxMSP patch which routes all the incoming OSC data to separate number boxes on screen.

With all the elements in place and running, it is difficult to see from the MaxMSP patch alone whether the numbers truly reflect the user's movements. A test was therefore performed with the Processing sketch, which would only draw the skeleton correctly if every part of the system was functioning as expected. The sketch required a few lines of code to be changed in order to run correctly, but this was trivial after a brief Google search of the problem.


Additional Max processing

Having confirmed that all the elements were running as expected, a Max patch was required to process the OSC data and arrive at the azimuth and elevation of both of the user's arms. This data would then be passed to a first-order ambisonic encoder to pan the sound source to the correct location.

This was achieved by tracking the user's shoulders and hands and obtaining the 3D coordinates of the hands relative to the shoulders. Using standard trigonometry, it was relatively straightforward to arrive at the azimuth and elevation of the hands. For the purposes of this test, the distance of the sound source was set slightly outwith the radius of the virtual array. No parameter was mapped to the distance between the hands and the shoulders for this activity, but this could be done for more specific applications.

The initial processing gives the distance of the hand from the shoulder in x and y components. Obtaining this relative distance, as opposed to an absolute distance from the receiver, simplifies the processing and also allows the user to stand at any location in front of the sensor, rather than at a set distance or angle from it.
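A minimal sketch of this geometry is given below. The axis conventions (x left-right, y vertical, z away from the sensor) and the sign choices are assumptions; in practice the zero-degree axis had to be rotated by hand, as noted in the next paragraph:

    import math

    def hand_angles(hand, shoulder):
        """Azimuth and elevation of a hand relative to its shoulder.
        `hand` and `shoulder` are (x, y, z) tuples of joint coordinates."""
        dx = hand[0] - shoulder[0]   # left-right offset
        dy = hand[1] - shoulder[1]   # vertical offset
        dz = hand[2] - shoulder[2]   # toward/away from the sensor
        # Pointing straight ahead (away from the sensor) is taken as 0 deg.
        azimuth = math.degrees(math.atan2(dx, -dz))
        horizontal = math.hypot(dx, dz)          # horizontal arm reach
        elevation = math.degrees(math.atan2(dy, horizontal))
        return azimuth, elevation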

Rather than use the trigonometry objects or expressions in Max, a single object was used to convert the Cartesian coordinates of the joints to the polar coordinates required by the panner. This object, cartopol, does exactly as described, converting x, y values to magnitude and angle. It was necessary to rotate the 0 degree axis to arrive at values which truly reflected the direction of the arms.
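For reference, the panning stage that these angles feed is, in first-order ambisonics, just the textbook B-format encoding equations. The sketch below shows the standard formulation; it is not necessarily the exact encoder object used in the patch:

    import math

    def encode_bformat(sample, azimuth_deg, elevation_deg):
        """Encode a mono sample into first-order B-format (W, X, Y, Z)
        at the given azimuth and elevation (standard equations)."""
        az = math.radians(azimuth_deg)
        el = math.radians(elevation_deg)
        w = sample / math.sqrt(2.0)               # omnidirectional component
        x = sample * math.cos(az) * math.cos(el)  # front-back
        y = sample * math.sin(az) * math.cos(el)  # left-right
        z = sample * math.sin(el)                 # up-down
        return w, x, y, z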

Caveats

The software used was found to be installable on a Mac but not on Windows, which meant it was not possible to try the system in the Glasgow SoundLab itself. The system was instead set up on the author's MacBook, using a two-channel ambisonic decoder feeding headphones worn by the user. The system as it stands could easily work in the SoundLab once a Mac is obtained, or alternatively by outputting the resulting B-format audio stream and sending it to the SoundLab's soundcard inputs.


Results

The end result panned two square-wave sound sources linked to the position of each hand, with a graphical output showing where the encoder was being 'told' to place each source. The results were remarkably stable and responsive, and very fun to use. As the system was tested only over headphones, the best results were achieved using azimuth-only control; a full 3D system will be tested when the equipment becomes available. When the system became confused, sound sources were suddenly panned at random, so it would be wise to map another gesture to a volume control to mute the output when this happens. Further work is needed to confirm that the position of the source truly reflects where the user thinks they are pointing.

Figure 4: Screengrab of the system running, with an image of the test subject. Note the position of the sources (marked in red and blue on the patch) and the position of the user's hands.

The system has been tested with a range of gestures. Panning sounds to the rear worked surprisingly well by positioning the hands behind the back, giving a complete 360 degree range. Crossing the arms over was also recognised by the sensor. The software seemed to cope when the user turned to face the opposite direction, but encountered a moment of confusion when the user faced 90 degrees to the sensor.

Evaluation

As mentioned previously, the system was very easy to set up and use, and provided very good control of the sound sources. The need for user calibration is a little tedious and limits the system slightly, as users cannot simply 'happen upon' it and use it straight away; they need to follow some instructions before it will work properly.


Future Work

This fun application could have some use in spatial audio performance or installations; with some careful thought and better processing it could also provide an alternative interface for interacting with demonstration patches, for instance moving through slides using 'swipe' gestures or controlling 3D models using push and pull gestures (a sketch of a simple swipe detector is given below). The system is rather gimmicky, but with careful design it could provide an interesting way of interacting with particular types of presentation. Careful programming alongside the multisource demo could give some degree of control over individual elements of the orchestra (mainly level at this stage), opening the possibility of 'virtual conducting'.
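As a rough illustration of how a swipe might be detected from the same joint stream, the sketch below watches the hand's horizontal velocity and fires when it exceeds a threshold. The thresholds and units are placeholders; this detector was not part of the work described above:

    import time

    class SwipeDetector:
        """Fire a callback when a hand moves sideways fast enough.
        Thresholds are placeholders and would need tuning to real data."""
        def __init__(self, on_swipe, min_speed=1.5, cooldown=0.8):
            self.on_swipe = on_swipe
            self.min_speed = min_speed   # x units per second
            self.cooldown = cooldown     # seconds between detections
            self.last_x = None
            self.last_t = None
            self.last_fire = 0.0

        def update(self, hand_x):
            """Call with the hand's x coordinate on every joint message."""
            now = time.time()
            if self.last_x is not None:
                speed = (hand_x - self.last_x) / max(now - self.last_t, 1e-6)
                if abs(speed) > self.min_speed and now - self.last_fire > self.cooldown:
                    self.on_swipe("right" if speed > 0 else "left")
                    self.last_fire = now
            self.last_x, self.last_t = hand_x, now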

More immediate future work will involve refining the mapping of data to gestures to provide more interesting control of the SoundLab. The most interesting possibility is the idea of 'painting' the 3D space around the listener with layers of sound which can be turned on and off and positioned purely with gestures. Furthermore, OSCeleton can support multiple users, so future work will test how well the system handles this. Obtaining some form of status feedback from OSCeleton, showing which users are successfully calibrated, would also be very useful.

References

Tohm Judson, 'OpenNI to Max/MSP via OSC', http://tohmjudson.com/?p=30, accessed August 2011.