The main camera of the device (the one with lots of megapixels) should be capable of capturing depth information over at least the same range it can focus on, at 30 FPS minimum, with a depth resolution of at least half a millimeter.
The secondary camera should have similar capabilities, with resolution and range reduced if necessary, but still allowing Project Natal-like functionality from across the room, as well as close-range head and eye tracking, face recognition and expression recognition.
Bonus points for photo and video recording software that fakes two-camera stereo recording by displacing pixels based on the depth map; extra bonus for image processing to compensate for occlusi
The technology already exists; I dunno if it can reach the required performance, though.
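
To make the fake-stereo idea a bit more concrete, here is a rough sketch of how the pixel displacement could work. Everything in it is an assumption for illustration: the function names, the pinhole relation disparity = focal_px * baseline_m / depth, and the very crude hole filling that just copies the neighbouring pixel from the right into the gaps that open up next to foreground objects. A real implementation would probably do something much smarter.

```python
def synthesize_right_view(image, depth, focal_px, baseline_m):
    """Fake a right-eye image from one image plus its depth map.

    image and depth are lists of rows with the same shape; depth is in
    metres. focal_px (focal length in pixels) and baseline_m (distance to
    the imaginary second camera) are hypothetical parameters.
    Pixels nobody lands on are left as None.
    """
    height, width = len(image), len(image[0])
    right = [[None] * width for _ in range(height)]
    zbuf = [[float("inf")] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            z = depth[y][x]
            if z <= 0:
                continue  # no valid depth sample for this pixel
            # Nearer pixels get a larger horizontal shift.
            disparity = int(round(focal_px * baseline_m / z))
            xr = x - disparity
            # Depth test: when two pixels land on the same spot, keep the nearer one.
            if 0 <= xr < width and z < zbuf[y][xr]:
                zbuf[y][xr] = z
                right[y][xr] = image[y][x]
    return right


def fill_holes(view):
    """Crude gap filling: copy the right neighbour, then the left one."""
    filled = [row[:] for row in view]
    for row in filled:
        for x in range(len(row) - 2, -1, -1):
            if row[x] is None:
                row[x] = row[x + 1]
        for x in range(1, len(row)):
            if row[x] is None:
                row[x] = row[x - 1]
    return filled


if __name__ == "__main__":
    # One scanline: background at 2 m, a foreground object at 1 m (x = 2, 3).
    image = [[10, 20, 30, 40, 50, 60]]
    depth = [[2.0, 2.0, 1.0, 1.0, 2.0, 2.0]]
    shifted = synthesize_right_view(image, depth, focal_px=4.0, baseline_m=0.5)
    print(shifted)              # [[30, 40, None, 50, 60, None]]
    print(fill_holes(shifted))  # [[30, 40, 50, 50, 60, 60]]
```

The None entries are the spots the single real camera never saw: one opens up beside the foreground object and one at the image border. Filling those in convincingly is the hard part.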