The Software and Tracking of the Kinect
The depth sensing hardware is the first feature that makes the Kinect special, but without software it is just so much clever optics and processing. When Microsoft first released the Kinect, it was very much for Xbox and Microsoft applications only. It didn’t take long for the USB connection to be decoded, and open source USB drivers appeared on the web. With these you could connect the Kinect to a Linux or Windows machine and access the raw data.
Only a few weeks later, developing for the Kinect became easier thanks to an open source initiative set up by PrimeSense, the original designer of the Kinect hardware. All of the necessary APIs were made available as a project called OpenNI, with drivers for Windows and Ubuntu.
The only problem is that the drivers they supply are for their own reference hardware. However, changing the driver is a matter of modifying configuration files, and this has already been done for the Kinect, as explained later.
So how does the Kinect track people?
There are two answers to this question: the old way, and the new way designed by Microsoft Research.
The new Microsoft way of doing things has significant advantages, but at the moment only Microsoft has access to it. An SDK for the Kinect has been promised for the future, and this might include some “middleware” that does body tracking – or it might not.
Currently the only body tracking software that is available for general use is NITE from PrimeSense. This isn’t open source, but you can use it via a licence key provided at the OpenNI website. It works in the way that all body tracking software worked until Microsoft found a better way, so let’s look briefly at its principles of operation.
The NITE software takes the raw depth map and finds a skeleton position that shows the body position of the subject. It does this by performing a segmentation of the depth map into objects and then tracking the objects as they move from frame to frame.
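To make the idea concrete, here is a minimal sketch of the two steps just described: segmenting a depth map into objects (connected regions of similar depth) and tracking them between frames by matching centroids. This is purely illustrative – it is not NITE’s actual algorithm, and the tolerance value and the use of nearest-centroid matching are assumptions made for the example.

```python
# Illustrative sketch (not NITE's actual code): segment a depth map into
# connected regions of similar depth, then track regions between frames
# by matching each tracked centroid to the nearest new region.

from collections import deque

def segment(depth, tolerance=50):
    """Group adjacent pixels whose depths differ by <= tolerance.
    depth: 2D list of depth readings, 0 = no reading.
    Returns a list of regions, each a list of (row, col) pixels."""
    rows, cols = len(depth), len(depth[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or depth[r][c] == 0:
                continue
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:                       # flood fill one region
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and depth[ny][nx] != 0
                            and abs(depth[ny][nx] - depth[y][x]) <= tolerance):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions

def centroid(region):
    """Mean (row, col) position of a region's pixels."""
    return (sum(p[0] for p in region) / len(region),
            sum(p[1] for p in region) / len(region))

def track(prev_centroids, new_regions):
    """Match each tracked label to the nearest region in the new frame."""
    matches = {}
    for label, (py, px) in prev_centroids.items():
        best = min(new_regions,
                   key=lambda reg: (centroid(reg)[0] - py) ** 2
                                 + (centroid(reg)[1] - px) ** 2)
        matches[label] = centroid(best)
    return matches
```

On a toy 3×5 depth map containing a “person” blob at around 900 mm and a background object at 2000 mm, `segment` returns two regions; if the person blob shifts one pixel to the right in the next frame, `track` follows its centroid to the new position.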
This is achieved by constructing an avatar, i.e. a model of the body being detected, and attempting to find a match for it in the data provided by the depth camera. Tracking is then a matter of updating the match by moving the avatar as the data changes.
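The avatar-matching idea can be sketched very simply: represent the avatar as a set of named joints connected by fixed-length “bones”, then repeatedly pull each joint toward its nearest observed depth point and re-impose the bone lengths. The joint names, bone lengths, and the naive nearest-point fit below are all invented for illustration – real systems use far richer models – but this is the general shape of the approach, and its fragility hints at why such trackers lose the track.

```python
# Illustrative sketch (not the actual Kinect/NITE code): fit a tiny 2D
# stick-figure avatar to a cloud of observed points by alternating two
# steps -- snap each joint to its nearest observed point, then restore
# the avatar's fixed bone lengths. Tracking repeats this fit on every
# frame, starting from the previous frame's pose.

import math

# Hypothetical avatar: three joints and the fixed-length bones between them.
JOINTS = ["head", "torso", "hand"]
BONES = {("head", "torso"): 2.0, ("torso", "hand"): 3.0}

def nearest(point, cloud):
    """The observed point closest to `point`."""
    return min(cloud, key=lambda q: (q[0] - point[0]) ** 2
                                  + (q[1] - point[1]) ** 2)

def enforce_bone(a, b, length):
    """Move joint b so it lies exactly `length` away from joint a."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    d = math.hypot(dx, dy) or 1e-9      # avoid division by zero
    return (a[0] + dx * length / d, a[1] + dy * length / d)

def fit(pose, cloud, iterations=10):
    """pose: dict joint -> (x, y). Returns the pose fitted to the cloud."""
    for _ in range(iterations):
        # Step 1: pull every joint to the nearest observed point.
        pose = {j: nearest(pose[j], cloud) for j in JOINTS}
        # Step 2: re-impose the skeleton's fixed bone lengths.
        for (a, b), length in BONES.items():
            pose[b] = enforce_bone(pose[a], pose[b], length)
    return pose
```

Starting the fit from the previous frame’s pose is what makes it tracking rather than detection; it is also why a fit that drifts too far from the body cannot recover, as the next paragraph describes.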
This was the basis of the first Kinect software that Microsoft tried out, and it didn’t work well enough for a commercial product. After about a minute or so it tended to lose the track and then be unable to recover it. It also had the problem that it only worked for people who were the same size and shape as the system’s developer – because that was the size and shape of the avatar used for matching.