Location
Wang Center Theater
Event Description

Jitendra Malik
UC Berkeley
Time: Friday, March 8th, 2013, 2:30 pm
Location: Wang Center

The Three R's of Computer Vision: Recognition, Reconstruction and Reorganization

Over the last two decades, we have seen remarkable progress in computer vision with demonstration of capabilities such as face detection, handwritten digit recognition, reconstructing three-dimensional models of cities, automated monitoring of activities, segmenting out organs or tissues in biological images, and sensing for control of robots and cars. Yet there are many problems where computers still perform significantly below human perception.

For example, in the recent PASCAL benchmark challenge on visual object detection, the average precision for most 3D object categories was under 50%.

I will argue that further progress on the classic problems of computational vision (recognition, reconstruction and reorganization) requires us to study the interaction among these processes. For example, recognition of 3D objects benefits from a preliminary reconstruction of 3D structure, rather than being treated as just a 2D pattern classification problem. Recognition is also reciprocally linked to reorganization, with bottom-up grouping processes generating candidates, which combine with top-down activations of object and part detectors. In this talk, I will show some of the progress we have made towards the goal of a unified framework for the 3R's of computer vision. I will also point towards some of the exciting applications we may expect over the next decade as computer vision starts to deliver on even more of its grand promise.