Kinect hacked, sliced and diced
Released in November 2010, Microsoft's Kinect has created a bow wave amongst the more technically minded communities. Most computing departments in universities own one, and they're doing very cool stuff with it. You will have seen the long stream of videos under the tag "Kinect hacks", and you will have seen it do things it was never intended to do. In this post, I want to talk about why it's actually such a complicated device, and where it's going next. If you read this blog, you more than likely know not only what it is but the gory technical details too. I won't spend long on those as a result, but just in case this post goes "mainstream", let's introduce ourselves to the box before we talk about thinking outside of it:
Microsoft have introduced it like this:
"Kinect brings games and entertainment to life in extraordinary new ways without using a controller. Imagine controlling movies and music with the wave of a hand or the sound of your voice*. With Kinect, technology evaporates, letting the natural magic in all of us shine...Controller-free gaming means full body play. Kinect responds to how you move. So if you have to kick, then kick. If you have to jump, then jump. You already know how to play. All you have to do now is to get off the couch...Once you wave your hand to activate the sensor, your Kinect will be able to recognise you and access your Avatar. Then you’ll be able to jump in and out of different games, and show off and share your moves."
The technology isn't all Microsoft's. Kinect uses range-camera technology, developed by the Israeli company PrimeSense, which derives 3D information from continuously projected infrared structured light. The 3D scanner system uses a variant of image-based 3D reconstruction. It's motion- and voice-sensitive, which makes it a "Natural User Interface": no controller is needed. The depth sensor pairs an infrared laser projector with a monochrome CMOS sensor, which captures video data in 3D under any ambient light conditions. GamesRadar has a nice techy review of it that you might like too.
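To make that a little more concrete: the depth sensor doesn't hand you metres directly, it hands you raw 11-bit disparity values that have to be converted to distances. The sketch below is not Microsoft's or PrimeSense's actual firmware; it uses a community-fitted approximation from the OpenKinect project, and the coefficients vary slightly per device:

```python
def raw_disparity_to_metres(raw: int) -> float:
    """Convert a raw 11-bit Kinect disparity value to an approximate
    distance in metres.

    Uses the community-fitted model from the OpenKinect project
    (coefficients are empirical and vary slightly per device):
        depth = 1 / (raw * -0.0030711016 + 3.3309495161)
    """
    # 2047 is the sensor's "no reading" marker (shadowed or out of range)
    if not 0 <= raw < 2047:
        raise ValueError("raw disparity out of range or invalid")
    return 1.0 / (raw * -0.0030711016 + 3.3309495161)
```

Note how non-linear the mapping is: raw values in the 600s come out well under a metre, while values around 1000 are already several metres away, which is part of why the sensor's precision falls off with distance.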
"Think Tom Cruise in Minority report" is another way I've heard it described. A big thank you to MIT for making that a reality, but it wasn't really made for this either. It was made to play Xbox games. Microsoft have however embraced the hacking of its newest offering and have made available an SDK (thank you for this MSFT).
There are alternatives to the SDK too; in fact, until very recently there wasn't an SDK at all. Adafruit ran a competition for open Kinect drivers, which Hector Martin won, and the resulting code is available through OpenKinect. There's also a Google Group if you're interested, and loads more support online. It's worth noting that PrimeSense have made their motion-tracking middleware, NITE, available as well. Kinect-Hacks is a great site if you want to get hands-on.
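One of the first things middleware like NITE or the OpenKinect drivers has to do is turn a flat depth image into actual 3D points. A minimal sketch of that back-projection step is below; the intrinsic parameters are commonly cited community calibrations for the Kinect's depth camera, not official values, and real code would calibrate each device individually:

```python
# Approximate depth-camera intrinsics for the Kinect, taken from
# community calibrations (assumed values, not official ones).
FX, FY = 594.21, 591.04   # focal lengths, in pixels
CX, CY = 339.5, 242.7     # principal point (optical centre), in pixels

def depth_pixel_to_point(u: int, v: int, depth_m: float):
    """Back-project one depth pixel (u, v) at depth_m metres into a
    3D point (x, y, z) in the camera frame, using the pinhole model."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)
```

Run over every pixel of a 640x480 depth frame, this is how the "point cloud" views in so many Kinect-hack videos are produced.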
Anyway...now we know what it is, back to the point of this post.
It occurred to me how different the user experience has become. I don't have any insider information, but I reckon MSFT spent a small fortune on the user experience for Kinect. I can well imagine how much testing must have gone into making sure that when a child steps in front of it, all of the expected things happen. This makes for really easy gameplay and allows someone to really "be the controller". There have been some really useful usability reviews, such as the one by Jakob Nielsen, one by Steve Cable, and the comparison by Sheryl Yu Lin. They show that although it's a good product, the overall opinion is "could do better". The proof is in the pudding though, and I reckon that most people would be able to:
- Grasp what they're supposed to do
- Understand what's happening
- Understand where it's happening
When we move to the land of Kinect hacks and away from the purpose-built environment it was designed for, we lose all three of these for many people. Most of the hacks, be it the Shadow Puppets or the Optical Camouflage, are at the proof-of-concept stage. They're fun and are a nice example of how you can "leave the box" and think outside of it. I haven't yet seen any hacks that go beyond proof of concept or that are genuinely useful. There are many reasons for this, but mainly, the technology is very new and so is the whole idea of hacking it. I reckon more interesting things will emerge further down the line.
What I'm finding interesting, though, is that once you remove the context (the Xbox), you are left with a lot of work to do before the non-geek can grasp what's going on. I ran a creative technology workshop recently with a group of business people. We looked at Kinect specifically, and after spending time understanding the technology, we looked at a series of hacks too. They were all amazed, in awe like kids in a candy store. I then asked them to get into small groups and come up with some ideas for hacks of their own. Bearing in mind that none of them were technical people, I was hoping to get some interesting insight into usage.

Interestingly, what I found is that they struggled to come up with anything new. Having run quite a lot of these workshops before, with the same kinds of people, I was puzzled. Normally, they come up with a few things I would never have thought of. More interestingly, they all presented more or less the same idea, and none of them noticed. Mostly, their thoughts revolved around using Kinect to do something that is already possible without it. Seeing as we are talking about a technology that lets you speak to the computer and interact physically with a virtual environment, I was stunned. When the technology is placed safely back into its Xbox environment, there is far less struggle with it as a technology, or even as an idea.
MSFT are already investigating what comes after Kinect:
For a long time we investigated (or obsessed over) putting ourselves into a virtual world. We did this with avatars, through games, through virtual worlds and so on. Now we've flipped it on its head, and we're more interested in integrating the virtual into the real world. This changes the whole user experience, and I don't think we've given it a lot of thought as yet. I'd be interested in hearing about any projects or testing happening in relation to this.
In the meantime, check out this very cool video, taking us further down those futuristic paths: