Professor Tamara Berg receives Google Faculty Research Award


Nearly every blog or social network uses a combination of images, text, and other modalities (e.g., location) to convey information. Sometimes pictures are even the main or only source of content on a social site. For example, on some sites the structure of the social network itself is predominantly defined through photographs and the social interactions around these images. On other sites, such as photography blogs, the pictures themselves are the medium of information. Even more often, pictures are used in combination with text to convey information. For example, travel blogs use multi-modal media to describe the experience, look, and feel of places, and street fashion blogs use pictures to describe and exemplify ever-evolving style trends.

Despite the underlying multi-modal nature of online social communities, most blog and social network analysis has focused on either understanding the structure of social networks, or using text processing techniques to access content. Very little research has been done on combining image analysis techniques with blog or network analysis. This is perhaps because computer vision is a very challenging problem and the results of automatic computer vision techniques are often extremely noisy. However, for specific settings or more constrained visual recognition problems, visual analysis may be feasible and useful, e.g., for clothing recognition or trend analysis.

The focus of this Google award will be to begin exploring what visual recognition algorithms can help reveal about social network structure, individual relationships, networks of influence, and characteristic patterns of interaction. Possible applications include improved prediction of individual decisions and preferences, discovery of latent patterns, and identification of emerging trends. All are exciting in terms of understanding social behavior and potentially relevant for e-commerce applications such as product search or suggestion.